OpenSearch
At GDP Labs, we use OpenSearch to store and visualize log data.
Glossary
DQL/Lucene: Query languages for flexible log searching.
Index Pattern: A string expression that defines which indices are queried and visualized.
Prerequisites
Ensure you have access to OpenSearch and the relevant index pattern.
Know the impacted application/service, environment (production/staging), and an approximate timeframe for the incident.
Workflow: Investigating Application Log Incidents
Identify the Incident Scope
Determine the affected application/system, environment (production/staging), and time window.
Select Index Pattern
Select the appropriate index pattern to specify which data sources and indices you want to query. Focus on the relevant environment, e.g., choose gdplabs-eks-production-* for production issues.
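To make the effect of the wildcard concrete, here is a minimal Python sketch of how a pattern such as gdplabs-eks-production-* selects indices. The index names are hypothetical, and the sketch only models the * wildcard, not every feature an index pattern may support.

```python
from fnmatch import fnmatchcase


def matching_indices(pattern: str, indices: list[str]) -> list[str]:
    """Return the index names that a wildcard index pattern would select."""
    return [name for name in indices if fnmatchcase(name, pattern)]


# Hypothetical index names, for illustration only.
indices = [
    "gdplabs-eks-production-2024.05.01",
    "gdplabs-eks-production-2024.05.02",
    "gdplabs-eks-staging-2024.05.01",
]

selected = matching_indices("gdplabs-eks-production-*", indices)
print(selected)
```

Only the two production indices are selected; the staging index is left out, which is why choosing the right pattern is the first thing to check when logs seem to be missing.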
Set the Time Range: Filter by Incident Period
Use the time picker to narrow the query to the timeframe of the incident.
Choose between relative dates (e.g., Last 24 hours) or absolute values (specific start/end).


The calendar icon and time settings make it easy to specify the precise timeframe when the incident occurred.
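For readers who want to see what the time picker amounts to, the sketch below builds the kind of range clause on @timestamp that a relative or absolute selection corresponds to in the OpenSearch query DSL. The helper names and the example date are assumptions made for illustration.

```python
from datetime import datetime, timezone


def relative_range(last: str) -> dict:
    """Range clause for a relative window such as 'Last 24 hours' (last='24h')."""
    # OpenSearch date math: "now-24h" means 24 hours before the current time.
    return {"range": {"@timestamp": {"gte": f"now-{last}", "lte": "now"}}}


def absolute_range(start: datetime, end: datetime) -> dict:
    """Range clause for an absolute window with explicit start and end."""
    return {
        "range": {
            "@timestamp": {"gte": start.isoformat(), "lte": end.isoformat()}
        }
    }


# Hypothetical incident window, in UTC.
start = datetime(2024, 5, 1, 10, 30, tzinfo=timezone.utc)
end = datetime(2024, 5, 1, 11, 30, tzinfo=timezone.utc)

print(relative_range("24h"))
print(absolute_range(start, end))
```

Using timezone-aware datetimes avoids a common pitfall where the incident time is read in local time but the logs are stored in UTC.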
Field-Based Filtering: Customize the Log View
Use the list of fields in the left sidebar to view all available log fields.
Select or deselect fields you want to display, such as message, kubernetes.namespace, kubernetes.container.name, etc.
The field selection panel separates popular fields from all available fields for easy navigation and discovery.
Apply Filters: Precise Log Selection
Add filters to focus your search (e.g., by namespace, pod, or error type).
Use operators like is, exists, or is one of.
The add filter interface allows you to set filter operators such as is, is not, is one of, exists, and does not exist.
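The filter operators can be thought of as query DSL clauses. The following sketch is an approximation of that mapping, not an exact copy of what the filter interface generates; the function name build_filter is hypothetical.

```python
def build_filter(field: str, operator: str, value=None) -> dict:
    """Translate a filter operator into an approximate query DSL clause."""
    if operator == "is":
        return {"match_phrase": {field: value}}
    if operator == "is one of":
        # Match if any of the listed values is present.
        return {
            "bool": {
                "should": [{"match_phrase": {field: v}} for v in value],
                "minimum_should_match": 1,
            }
        }
    if operator == "exists":
        return {"exists": {"field": field}}
    if operator == "is not":
        # Negations wrap the positive clause in must_not.
        return {"bool": {"must_not": [build_filter(field, "is", value)]}}
    if operator == "does not exist":
        return {"bool": {"must_not": [build_filter(field, "exists")]}}
    raise ValueError(f"unsupported operator: {operator}")


print(build_filter("kubernetes.namespace", "is", "production"))
print(build_filter("log.level", "is one of", ["error", "warning"]))
print(build_filter("otelTraceID", "does not exist"))
```

Seeing the negated operators as must_not wrappers helps explain why combining several "is not" filters narrows the results rather than widening them.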
Advanced Querying: Using DQL or Lucene
For more complex searches, use Dashboard Query Language (DQL) or Lucene.
Example DQL queries:
Error logs in production
kubernetes.namespace: "production" and log.level: "error"
Logs by trace or pod
trace.id: "abcd1234"
Search log messages for "timeout"
message: "timeout"
Switch between DQL and Lucene as needed. You can change the query language from the search bar.
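DQL is a feature of the Dashboards search bar. When a similar search is run outside the UI, Lucene syntax can be passed to the search API through a query_string query. The sketch below builds the Lucene counterpart of the first DQL example; the helper names are hypothetical, and note that Lucene expects the uppercase AND operator.

```python
def lucene_all_of(terms: dict[str, str]) -> str:
    """Build a Lucene query that requires every field to match its phrase."""
    parts = []
    for field, value in terms.items():
        # Escape backslashes and double quotes inside the quoted phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{field}:"{escaped}"')
    return " AND ".join(parts)


def query_string_body(lucene: str) -> dict:
    """Wrap a Lucene query in a query_string search body."""
    return {"query": {"query_string": {"query": lucene}}}


lucene = lucene_all_of(
    {"kubernetes.namespace": "production", "log.level": "error"}
)
print(lucene)
print(query_string_body(lucene))
```

The first print shows kubernetes.namespace:"production" AND log.level:"error", which can also be pasted into the search bar after switching the query language to Lucene.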

Analyze Log Volume Trends (Bar Chart)
Use the vertical bar chart to see log activity over time.
Click bars to zoom in on peak times or anomalies for deeper analysis.
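The bar chart corresponds to a date histogram over @timestamp. As a rough sketch, the code below builds such an aggregation body and picks the busiest bucket from a response-shaped list. The aggregation name logs_over_time, the 5-minute interval, and the sample buckets are assumptions for illustration.

```python
def histogram_body(interval: str = "5m") -> dict:
    """Search body that counts log entries per fixed time interval."""
    return {
        "size": 0,  # return only the aggregation, no individual hits
        "aggs": {
            "logs_over_time": {
                "date_histogram": {
                    "field": "@timestamp",
                    "fixed_interval": interval,
                }
            }
        },
    }


def peak_bucket(buckets: list[dict]) -> dict:
    """Return the bucket with the highest document count."""
    return max(buckets, key=lambda bucket: bucket["doc_count"])


# Hypothetical buckets, shaped like a date_histogram response.
buckets = [
    {"key_as_string": "2024-05-01T10:50:00.000Z", "doc_count": 120},
    {"key_as_string": "2024-05-01T10:55:00.000Z", "doc_count": 135},
    {"key_as_string": "2024-05-01T11:00:00.000Z", "doc_count": 980},
    {"key_as_string": "2024-05-01T11:05:00.000Z", "doc_count": 410},
]

peak = peak_bucket(buckets)
print(peak["key_as_string"], peak["doc_count"])
```

In this sample the 11:00 bucket stands out, which is the same signal you would look for visually as a spike in the chart.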

Review Log List
The Documents tab displays all log entries sorted by @timestamp.
Scan for errors, warnings, or notable events around the incident.

Inspect Log Details
Click any log entry to see a detailed breakdown in table or JSON view.
Check for context such as error codes, deployment info, container, pod, user agent, etc.
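The table view lists fields by dotted names such as kubernetes.pod.name, while the JSON view shows the same data as nested objects. The sketch below shows how the two relate by flattening a nested entry into dotted field names. The sample log entry is hypothetical.

```python
def flatten(doc: dict, prefix: str = "") -> dict:
    """Flatten nested objects into dotted field names."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical log entry, as it might appear in the JSON view.
entry = {
    "@timestamp": "2024-05-01T11:00:03Z",
    "message": "upstream request timeout",
    "log": {"level": "error"},
    "kubernetes": {
        "namespace": "production",
        "pod": {"name": "api-7c9d"},
        "container": {"name": "api"},
    },
}

fields = flatten(entry)
for name in sorted(fields):
    print(f"{name} = {fields[name]}")
```

The dotted names produced here are the same ones used in filters and DQL queries, so the JSON view is a convenient place to find the exact field name to search on.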


Share Discovery Search Query
Follow these steps to share a search query with other people:
On the Discover page, create a new query by clicking the New button in the top-right corner.

Write the query in the search bar.
Save the query by clicking the Save button. Set a title for the saved query.

Click the Share button, choose to generate the link as Snapshot, and click Copy link. Share the copied link.

Alternatively, share the title of the saved query; the recipient can load it by clicking the Open button and selecting that title.

Troubleshooting: When Logs Don’t Appear
Double-check the time range, especially when switching between "Relative" and "Absolute" mode.
Confirm the correct index pattern is selected.
Loosen filters if too few logs appear; make them stricter if there are too many.
Check for typos or syntax errors in your DQL/Lucene queries.
Ensure you have permission to view the relevant index pattern.
Best Practices for Incident Log Analysis
Start with broad time and filter scopes, then refine as you spot patterns.
Combine field-based filters and search queries for the most accurate results.
Always expand individual log entries for full context.
Use export/share features to collaborate (if supported in your platform).
Correlate log findings with other observability tools (traces, metrics, APM) if available. Search by the otelTraceID field to find logs from a specific OpenTelemetry trace, or by the otelSpanID field to find logs from a specific span.
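Once logs for a trace have been found, it can help to organize them by span. The following sketch assumes log entries have been exported as a list of dictionaries and groups the entries of one trace by otelSpanID, ordered by time. The sample entries and IDs are hypothetical.

```python
from collections import defaultdict


def logs_by_span(entries: list[dict], trace_id: str) -> dict[str, list[dict]]:
    """Group the log entries of one trace by span ID, oldest first."""
    grouped = defaultdict(list)
    for entry in entries:
        if entry.get("otelTraceID") == trace_id:
            grouped[entry.get("otelSpanID", "")].append(entry)
    for span_entries in grouped.values():
        # ISO 8601 timestamps in the same format sort correctly as strings.
        span_entries.sort(key=lambda e: e["@timestamp"])
    return dict(grouped)


# Hypothetical exported log entries.
entries = [
    {"@timestamp": "2024-05-01T11:00:05Z", "otelTraceID": "abcd1234",
     "otelSpanID": "span-b", "message": "query failed"},
    {"@timestamp": "2024-05-01T11:00:01Z", "otelTraceID": "abcd1234",
     "otelSpanID": "span-a", "message": "request received"},
    {"@timestamp": "2024-05-01T11:00:03Z", "otelTraceID": "abcd1234",
     "otelSpanID": "span-b", "message": "query started"},
    {"@timestamp": "2024-05-01T11:00:04Z", "otelTraceID": "ffff0000",
     "otelSpanID": "span-z", "message": "unrelated request"},
]

grouped = logs_by_span(entries, "abcd1234")
for span_id, span_entries in grouped.items():
    print(span_id, [e["message"] for e in span_entries])
```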
Quick-Reference Table: Common Log Fields
Below is a quick-reference table detailing the common log fields and their descriptions.
| Field | Description |
| --- | --- |
| @timestamp | The date and time the event occurred |
| message | The main log message |
| log.level | Log severity level (info, error, etc.) |
| kubernetes.container.name | Name of the Docker/Kubernetes container |
| kubernetes.namespace | Kubernetes namespace |
| kubernetes.pod.name | Kubernetes pod name |
| agent.type | Log collector agent type (e.g., filebeat) |
| otelTraceID | Trace ID associated with this log entry |
| otelSpanID | Span ID associated with this log entry |
Example Walkthrough
An incident occurs on a production service at 11:00.
Select the gdplabs-eks-production-* index pattern.
Set the Absolute/Relative time range to cover the period around 11:00.
Filter logs with DQL:
log.level: "error"
Click the spike in the bar chart around that time.
Review entries for error/warning patterns.
Expand specific entries for detailed context (trace ID, container, etc.).
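The walkthrough above can also be sketched as a single search body. This is an illustrative approximation under stated assumptions: the incident date, the 15-minute margin, the result size, and the function name are all hypothetical, and the body is only built here, not sent to a cluster.

```python
from datetime import datetime, timedelta, timezone


def incident_search_body(
    incident_time: datetime, margin: timedelta, level: str = "error"
) -> dict:
    """Search body for logs of one severity around an incident time."""
    start = incident_time - margin
    end = incident_time + margin
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "range": {
                            "@timestamp": {
                                "gte": start.isoformat(),
                                "lte": end.isoformat(),
                            }
                        }
                    },
                    {"match_phrase": {"log.level": level}},
                ]
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],  # newest first
        "size": 100,
    }


# Hypothetical incident at 11:00 UTC, searched with a 15-minute margin.
incident = datetime(2024, 5, 1, 11, 0, tzinfo=timezone.utc)
body = incident_search_body(incident, timedelta(minutes=15))
print(body)

# Assumption: with the opensearch-py client, a body like this could be sent as
# client.search(index="gdplabs-eks-production-*", body=body)
```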