Splunk listens to your data.
What story are we trying to tell?
What visualization tells that story best?
What is the best way to search for that data?
http://docs.splunk.com/images/Tutorial/tutorialdata.zip
Concepts | |
Events | An event is a set of values associated with a timestamp. It is a single entry of data and can span one or multiple lines. |
Host | A host is the name of the physical or virtual device where an event originated. The host field provides an easy way to find all data originating from a specific device. |
Source | A source is the name of the file, directory, stream, or other input from which an event originates. |
Sourcetype | Sources are classified into source types, which can be either well-known formats or formats defined by the user (for example, sourcetype=linux_syslog). |
Fields | Fields are searchable name and value pairings that distinguish one event from another. |
Tags | A tag is a knowledge object that enables you to search for events that contain particular field values. |
Index-Time | Index-time processes take place between the point when the data is consumed and the point when it is written to disk. |
Search-Time | Search-time processes take place while a search is run, as events are collected by the search. |
Index | When data is added, Splunk software parses that data into individual events, extracts the timestamp, and applies line-breaking rules. |
To count the frequency of one or more fields, use top/rare
Example: sourcetype=access_combined action=purchase | top host, itemId, product_name
Example: sourcetype=access_combined action=purchase | rare host, itemId, product_name
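As a rough illustration of what top and rare report, here is a minimal Python sketch. It is a simplified model for a single field, not Splunk's implementation; the function names and the host values are hypothetical. Each row carries the value, its count, and its share of all events as a percentage.

```python
from collections import Counter

def top(values, limit=10):
    """Return (value, count, percent) rows for the most common values."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, c, round(100 * c / total, 2)) for v, c in counts.most_common(limit)]

def rare(values, limit=10):
    """Return (value, count, percent) rows for the least common values."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda kv: kv[1])  # ascending by count
    return [(v, c, round(100 * c / total, 2)) for v, c in rows[:limit]]

# Hypothetical host values, one per matching event.
hosts = ["www1", "www2", "www1", "www3", "www1", "www2"]
print(top(hosts, limit=2))
print(rare(hosts, limit=1))
```

When several fields are listed, as in the examples above, the counting is done per combination of field values rather than per single value.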
To calculate statistics for two or more by fields (not time-based), use stats
Example: sourcetype=access_combined action=purchase | stats count by host, itemId, product_name
To calculate statistics with an arbitrary field as the x-axis (not _time), use chart
- When you use a by field, the output is a table where each column represents a distinct value of the split-by field
Example: sourcetype=access_combined action=purchase | chart sum(price) over host by itemId limit=5 useother=f
Host – Hostname, IP address, or name of the network host from which the events originated
Source – Name of the file, stream, or other input
Sourcetype – Specific data type or data format
To narrow results, add attribute=value to your search:
sourcetype=access_combined status=404
Now let’s take the result of that search and sort the results by URI:
status=404 | sort - uri
We don’t just want a list of these things; we want a report that tells us the worst offenders:
status=404 | top 5 referer_domain
Use the “timechart” reporting command to create charts that display statistical trends over time, with time plotted on the x-axis of the chart. Let’s plot count of 404s over time:
status=404 | timechart count
Let’s plot the total number of bytes for successful web accesses (status=200) over time:
status=200 | timechart sum(bytes)
During the last 24 hours, which IPs generated the most attacks?
sourcetype=secure password fail* | top BAD_IP
sourcetype=secure password fail* | top src
sourcetype=secure password fail* | top limit=5 BAD_IP
sourcetype=secure password fail* | top limit=5 src
Example: source=job_listings | where salary > industry_average
Example: source=job_listings salary>80000
Example: sourcetype=access_combined_wcookie field: “clientip”
Example: sourcetype=cisco_wsa_squid field: “clientip”
sourcetype="access_combined_wcookie" action=purchase | stats count by productId
- Chart revenue for the different products that were purchased yesterday
sourcetype=access_* action=purchase | timechart per_hour(price) by productName usenull=f useother=f
- Chart the number of purchases made daily for each type of product
sourcetype=access_* action=purchase | timechart span=1d count by categoryId usenull=f
- Count the total revenue made for each item sold at the shop over the course of the week
- This first search uses the span argument to bucket the times of the search results into 1 day increments. Then uses the sum() function to add the price for each product_name.
sourcetype=access_* action=purchase | timechart span=1d sum(price) by product_name usenull=f
- This second search uses the per_day() function to calculate the total of the price values for each day.
sourcetype=access_* action=purchase | timechart per_day(price) by product_name usenull=f
Both searches produce the same results table in the Statistics tab.
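To make the span=1d bucketing concrete, the following Python sketch models what timechart span=1d sum(price) by product_name computes: events are grouped by calendar day, then price is summed per product within each day. The function name, the sample events, and the product names are hypothetical, and this is only an approximation of Splunk's behavior.

```python
from collections import defaultdict
from datetime import date, datetime

def timechart_sum(events, value_field, by_field):
    """Sum value_field per day (span=1d) and per distinct by_field value."""
    table = defaultdict(lambda: defaultdict(float))
    for e in events:
        day = e["_time"].date()  # bucket the timestamp into a 1-day span
        table[day][e[by_field]] += e[value_field]
    return {day: dict(series) for day, series in table.items()}

# Hypothetical purchase events.
events = [
    {"_time": datetime(2024, 1, 1, 9, 0), "product_name": "Widget", "price": 10.0},
    {"_time": datetime(2024, 1, 1, 17, 30), "product_name": "Widget", "price": 5.0},
    {"_time": datetime(2024, 1, 1, 12, 0), "product_name": "Gadget", "price": 20.0},
    {"_time": datetime(2024, 1, 2, 8, 15), "product_name": "Widget", "price": 10.0},
]
daily = timechart_sum(events, "price", "product_name")
print(daily[date(2024, 1, 1)])
```

Each day becomes one row and each product one column, which is the shape the Statistics tab shows.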
Count errors by host and status.
sourcetype=access_combined_wcookie status>200 | chart count by host, status
Limit results to only the top three errors.
sourcetype=access_combined_wcookie status>200 | chart count by host, status limit=3
Remove the grouping called OTHER.
sourcetype=access_combined_wcookie status>200 | chart count by host, status limit=3 useother=f
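The effect of limit and useother on the three searches above can be sketched in Python. This is a simplified model under the assumption that chart keeps the column values with the highest totals and folds the rest into OTHER; the function name and the sample events are hypothetical.

```python
from collections import Counter, defaultdict

def chart_count(events, row_field, col_field, limit=None, useother=True):
    """Count events per (row, column); keep only the `limit` largest columns."""
    counts = Counter((e[row_field], e[col_field]) for e in events)
    col_totals = Counter()
    for (_, col), n in counts.items():
        col_totals[col] += n
    keep = {col for col, _ in col_totals.most_common(limit)}
    table = defaultdict(Counter)
    for (row, col), n in counts.items():
        cols = table[row]  # make sure every row appears in the output
        if col in keep:
            cols[col] += n
        elif useother:
            cols["OTHER"] += n  # everything outside the limit is grouped
    return {row: dict(cols) for row, cols in table.items()}

# Hypothetical error events.
events = [
    {"host": "www1", "status": "404"},
    {"host": "www1", "status": "404"},
    {"host": "www1", "status": "503"},
    {"host": "www2", "status": "404"},
    {"host": "www2", "status": "500"},
    {"host": "www2", "status": "503"},
]
print(chart_count(events, "host", "status", limit=2))
print(chart_count(events, "host", "status", limit=2, useother=False))
```

With useother left on, the dropped status 500 shows up under OTHER for www2; with useother=False it disappears from the table entirely.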
What is a Calculated Field?
- Shortcut for performing repetitive, long, or complex transformations using the eval command.
- Must be based on an extracted field.
- Output fields from a lookup table or fields/columns generated from within a search string are not supported.
Example: sourcetype="cisco_wsa_squid" | eval bandwidth = sc_bytes/(1024*1024) | stats sum(bandwidth) as "Bandwidth (MB)" by usage | sort - "Bandwidth (MB)"
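The bandwidth expression converts bytes to megabytes, so the divisor has to be grouped as (1024*1024). The short Python check below shows why, assuming eval applies / and * from left to right as most languages do; the sc_bytes value is a made-up sample.

```python
sc_bytes = 5 * 1024 * 1024  # hypothetical event: 5 MB transferred

# Without parentheses the expression is (sc_bytes / 1024) * 1024,
# which simply returns the original byte count.
ungrouped = sc_bytes / 1024 * 1024

# With parentheses the byte count is divided by 1,048,576, giving megabytes.
grouped = sc_bytes / (1024 * 1024)

print(ungrouped, grouped)
```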
Search for all web appliance events [cisco_wsa_squid] during the last 24 hours.
sourcetype=cisco_wsa_squid
source="tutorialdata.zip:*" index=main
source="tutorialdata.zip:*" index=main "categoryid=sports"
sourcetype="access_combined_wcookie" action=purchase | top limit=20 productId
source="tutorialdata.zip:*" index=main sourcetype=access_*
source="tutorialdata.zip:*" index=main sourcetype=access_combined_wcookie
sourcetype="access_combined_wcookie" action=*
source="tutorialdata.zip:*" sourcetype=vendor_sales
sourcetype=access_combined_wcookie action=purchase | stats count by product
sourcetype="secure*" fail* password
sourcetype="access_combined_wcookie" action=purchase
sourcetype="access_combined_wcookie" action=remove
sourcetype="access_combined_wcookie" (action=remove OR action=purchase)
sourcetype="access_combined_wcookie" action=* productId=* | table action, productId, status
sourcetype="access_combined_wcookie" action=* productId=* | table action, productId, status | rename action as "Customer Action", productId as "Product Id", status as "HTTP Status"
sourcetype=secure port "failed password" | rex "\s+(?<ports>port\s\d+)" | top src ports showperc=0
index=testinputs * sourcetype=access_combined_wcookie action=purchase
REPORTING COMMANDS
- top – displays the most common values of a field
- rare – displays the least common values of a field
- stats – calculates statistics on the events that match your search criteria
sourcetype=secure password fail* "Failed password" | top date_month
sourcetype=secure password fail* "Failed password" | rare date_month
sourcetype="access_combined_wcookie" action=purchase | top productId
sourcetype="access_combined_wcookie" action=purchase | top productId limit=5
sourcetype="access_combined_wcookie" action=purchase | top productId limit=5 countfield="Unit Sold"
Search online transactions for purchases or lost sales events over the last 30 days.
Ans: sourcetype=access_combined (action=remove OR action=purchase)
Calculate the total value of sales as totalSales by action and product_name. Then, view the results as both a statistics table and column chart visualization.
Ans: sourcetype=access_combined (action=remove OR action=purchase)
| stats sum(price) as totalSales by action, product_name
Ensure product_name is on the x-axis, action identifies the series, and totalSales is on the y-axis.
Hint: xyseries command
Ans: sourcetype=access_combined (action=remove OR action=purchase)
| stats sum(price) as totalSales by action, product_name
| xyseries product_name, action, totalSales
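The reshaping that xyseries performs on the stats output can be pictured with this Python sketch: one row per x-axis value, one column per series value. The function is a simplified model and the sample rows are invented for illustration.

```python
def xyseries(rows, x_field, series_field, value_field):
    """Pivot flat stats rows into {x_value: {series_value: value}}."""
    table = {}
    for r in rows:
        table.setdefault(r[x_field], {})[r[series_field]] = r[value_field]
    return table

# Hypothetical output of: stats sum(price) as totalSales by action, product_name
rows = [
    {"action": "purchase", "product_name": "Widget", "totalSales": 120.0},
    {"action": "remove", "product_name": "Widget", "totalSales": 30.0},
    {"action": "purchase", "product_name": "Gadget", "totalSales": 80.0},
]
pivot = xyseries(rows, "product_name", "action", "totalSales")
print(pivot)
```

Here product_name ends up on the x-axis, each action becomes its own series, and totalSales supplies the y-axis values.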
Search online transactions for all HTTP status errors over the last 30 days.
Hint: status>399
Ans: sourcetype=access_combined status>399
Display customer interaction in the online store. Retrieve only clientip and referer_domain.
Search: sourcetype=access_combined | fields clientip, referer_domain
The table command returns a table formed by only the fields in the argument list.
Display the action, productId, and status of customer interaction in on-line store
Search: sourcetype=access_combined action=* productId=* | table action, productId, status
The rex command allows you to extract fields at search time. It matches the value of a field against an unanchored regex (defaults to field=_raw).
Display IP address and port of potential attackers
Search: sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port\s\d+)" | top src ports showperc=0
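The named capture group in the rex pattern can be tried out in Python, which spells named groups as (?P<name>...) instead of (?<name>...). The log line below is a made-up sample in the style of an sshd failure message, not taken from the tutorial data.

```python
import re

# Same pattern as the rex example, in Python's named-group syntax.
PORT_PATTERN = re.compile(r"\s+(?P<ports>port\s\d+)")

# Hypothetical raw event text (the role _raw plays in Splunk).
raw = "Failed password for invalid user admin from 10.2.3.4 port 4022 ssh2"

match = PORT_PATTERN.search(raw)  # unanchored: may match anywhere in the line
ports = match.group("ports") if match else None
print(ports)
```

The extracted value keeps the literal word "port" because it sits inside the capture group, so top groups on strings like "port 4022".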
During the last 24 hours, which IPs generated the most attacks?
Search: sourcetype=linux_secure password fail* | top src
During the last week, which were the top selling products from the online store?
Search: sourcetype=access_combined action=purchase | top product_name
distinct_count, dc – returns a count of unique values for a given field.
values – lists unique values of a given field.
How many unique websites have our employees visited?
Search: sourcetype=cisco_wsa_squid | stats dc(s_hostname)
Which websites have our employees accessed during the last 60 minutes?
Search: sourcetype=cisco_wsa_squid | stats list(s_hostname) by cs_username
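The difference between dc, values, and list is easiest to see side by side. The Python sketch below is a simplified model of the three stats functions; the hostnames are invented, and list_ carries a trailing underscore only to avoid shadowing Python's built-in list.

```python
def dc(field_values):
    """Distinct count: how many unique values the field takes."""
    return len(set(field_values))

def values(field_values):
    """Unique values of the field, in sorted order."""
    return sorted(set(field_values))

def list_(field_values):
    """Every value in event order, duplicates included."""
    return list(field_values)

# Hypothetical s_hostname values from one user's events.
hostnames = ["example.org", "example.com", "example.org", "example.net"]
print(dc(hostnames))
print(values(hostnames))
print(list_(hostnames))
```

If repeated visits to the same site are not of interest, values(s_hostname) gives a shorter result than list(s_hostname) in the last search.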