Splunk listens to your data.

What story are we trying to tell?

What visualization tells that story best?

What is the best way to search for that data?

http://docs.splunk.com/images/Tutorial/tutorialdata.zip

Concepts
Events: An event is a set of values associated with a timestamp. It can be a single line of data or span multiple lines.
Host: A host is the name of the physical or virtual device where an event originated. The host field provides an easy way to find all data originating from a specific device.
Source: A source is the name of the file, directory, stream, or other input from which an event originates.
Sourcetype: Sources are classified into source types, which can be either well-known formats or formats defined by the user (for example, sourcetype=linux_syslog).
Fields: Fields are searchable name and value pairings that distinguish one event from another.
Tags: A tag is a knowledge object that enables you to search for events that contain particular field values.
Index-Time: Index-time processes take place between the point when the data is consumed and the point when it is written to disk.
Search-Time: Search-time processes take place while a search is run, as events are collected by the search.
Index: When data is added, Splunk software parses it into individual events, extracts the timestamp, and applies line-breaking rules.

To count the frequency of a field (or fields), use top/rare

Example: sourcetype=access_combined action=purchase | top host, itemId, product_name

Example: sourcetype=access_combined action=purchase | rare host, itemId, product_name
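
As a rough illustration of what top and rare compute, here is a minimal Python sketch for a single field. The sample host values and the helper names top() and rare() are hypothetical, and the sketch only mirrors the count and percent columns, not Splunk's implementation.

```python
from collections import Counter

# Hypothetical sample of host values pulled from purchase events.
hosts = ["www1", "www2", "www1", "www3", "www1", "www2"]


def _ranked(values, limit, most_common_first):
    """Count each value, then rank by count with the value as a tie-breaker."""
    counts = Counter(values)
    total = len(values)
    sign = -1 if most_common_first else 1
    ranked = sorted(counts.items(), key=lambda kv: (sign * kv[1], kv[0]))
    # Each row is (value, count, percent of all events).
    return [(v, c, round(100.0 * c / total, 2)) for v, c in ranked[:limit]]


def top(values, limit=10):
    """Most common values first, similar in spirit to `top`."""
    return _ranked(values, limit, most_common_first=True)


def rare(values, limit=10):
    """Least common values first, similar in spirit to `rare`."""
    return _ranked(values, limit, most_common_first=False)


print(top(hosts))
print(rare(hosts))
```

The limit argument plays the same role as limit=5 in the searches in these notes: it caps how many rows are returned.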

Use stats to calculate statistics grouped by two or more by-fields (not time-based)

Example: sourcetype=access_combined action=purchase | stats count by host, itemId, product_name

To calculate statistics with an arbitrary field as the x-axis (not _time), use chart

  • When you use a by field, the output is a table where each column represents a distinct value of the split-by field

Example:  sourcetype=access_combined  action=purchase | chart sum(price) over host by itemId limit=5 useother=f
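
The over/by pivot can be pictured as a table keyed by two fields. The Python sketch below is only an illustration with hypothetical events and a hypothetical helper chart_sum(); it ignores limit and useother.

```python
from collections import defaultdict

# Hypothetical purchase events: (host, itemId, price).
events = [
    ("www1", "EST-14", 24.99),
    ("www1", "EST-15", 39.99),
    ("www2", "EST-14", 24.99),
    ("www1", "EST-14", 24.99),
]


def chart_sum(rows):
    """Pivot: one row per host, one column per itemId, cell = sum(price)."""
    table = defaultdict(lambda: defaultdict(float))
    for host, item, price in rows:
        table[host][item] += price
    # Convert the nested defaultdicts to plain dicts for readable output.
    return {host: dict(cols) for host, cols in table.items()}


result = chart_sum(events)
for host in sorted(result):
    print(host, result[host])
```

Each outer key corresponds to a value of the over field (a row), and each inner key to a value of the by field (a column).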

 

Host – Hostname, IP address, or name of the network host from which the events originated
Source – Name of the file, stream, or other input
Sourcetype  – Specific data type or data format

To narrow results, just add attribute=value to your search:

sourcetype=access_combined status=404

Now let’s take the result of that search and sort the results by URI:

status=404 | sort - uri

We don’t just want a list of these things; we want a report that tells us the worst offenders:

status=404 | top 5 referer_domain

Use the “timechart” reporting command to create charts that display statistical trends over time, with time plotted on the x-axis of the chart.  Let’s plot count of 404s over time: 

status=404 | timechart count

Let’s plot the total number of bytes for successful web accesses (status=200) over time:

status=200 | timechart sum(bytes)
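
To make the time bucketing concrete, here is a minimal Python sketch that floors each timestamp to the start of its span and counts events per bucket. The timestamps and the helper timechart_count() are hypothetical, and unlike timechart the sketch does not emit empty buckets.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps of status=404 events.
timestamps = [
    datetime(2024, 1, 1, 9, 15),
    datetime(2024, 1, 1, 9, 45),
    datetime(2024, 1, 1, 10, 5),
    datetime(2024, 1, 2, 9, 30),
]


def timechart_count(times, span="1h"):
    """Floor each timestamp to its bucket start and count events per bucket."""
    def bucket(t):
        if span == "1h":
            return t.replace(minute=0, second=0, microsecond=0)
        if span == "1d":
            return t.replace(hour=0, minute=0, second=0, microsecond=0)
        raise ValueError("this sketch only supports spans of 1h and 1d")

    counts = Counter(bucket(t) for t in times)
    # Sorted by bucket start so time runs left to right, as on the x-axis.
    return sorted(counts.items())


for start, count in timechart_count(timestamps, span="1d"):
    print(start.date(), count)
```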

During the last 24 hours, which IPs generated the most attacks?

sourcetype=secure password fail* | top BAD_IP  

sourcetype=secure password fail* | top src

sourcetype=secure password fail* | top  limit=5 BAD_IP

sourcetype=secure password fail* | top limit=5 src

Example:  source=job_listings | where salary > industry_average

Example:  source=job_listings  salary>80000

Example: sourcetype=access_combined_wcookie     field: “clientip”

Example: sourcetype=cisco_wsa_squid field: “clientip”

sourcetype="access_combined_wcookie" action=purchase | stats count by productId

  1. Chart revenue for the different products that were purchased yesterday

sourcetype=access_* action=purchase | timechart per_hour(price) by productName usenull=f useother=f

 

  1. Chart the number of purchases made daily for each type of product

sourcetype=access_* action=purchase | timechart span=1d count by categoryId usenull=f

 

  1. Count the total revenue made for each item sold at the shop over the course of the week
  2. This first search uses the span argument to bucket the times of the search results into 1 day increments. Then uses the sum() function to add the price for each product_name.

sourcetype=access_* action=purchase | timechart span=1d sum(price) by product_name usenull=f

  1. This second search uses the per_day() function to calculate the total of the price values for each day.

sourcetype=access_* action=purchase | timechart per_day(price) by product_name usenull=f

Both searches produce the following results table in the Statistics tab.

 

Count errors by host and status.

sourcetype=access_combined_wcookie status>200 | chart count by host, status

Limit results to only the top three errors.

sourcetype=access_combined_wcookie status>200 | chart count by host, status limit=3

 

Remove the grouping called OTHER.

sourcetype=access_combined_wcookie status>200 | chart count by host, status limit=3 useother=f

What is a Calculated Field?  

  • Shortcut for performing repetitive, long, or complex transformations using the eval command.
  • Must be based on an extracted field.
  • Output fields from a lookup table or fields/columns generated from within a search string are not supported.

Example:  sourcetype="cisco_wsa_squid" | eval bandwidth = sc_bytes/(1024*1024) | stats sum(bandwidth) as "Bandwidth (MB)" by usage | sort - "Bandwidth (MB)"
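
The bytes-to-megabytes conversion only works when the divisor is parenthesized. The Python sketch below shows the difference, assuming eval applies / and * left to right as most languages do. The function names and the sample value are hypothetical.

```python
def bytes_to_mb_unparenthesized(sc_bytes):
    # Evaluated left to right: (sc_bytes / 1024) * 1024, which gives back the input.
    return sc_bytes / 1024 * 1024


def bytes_to_mb(sc_bytes):
    # Parentheses force a division by the number of bytes in one MB.
    return sc_bytes / (1024 * 1024)


sample = 5 * 1024 * 1024  # hypothetical sc_bytes value: 5 MB expressed in bytes

print(bytes_to_mb_unparenthesized(sample))
print(bytes_to_mb(sample))
```

Without the parentheses the "Bandwidth (MB)" column would actually still be in bytes.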

Search for all web appliance events [cisco_wsa_squid] during the last 24 hours.

sourcetype=cisco_wsa_squid

source="tutorialdata.zip:*" index=main

source="tutorialdata.zip:*" index=main "categoryid=sports"

sourcetype="access_combined_wcookie" action=purchase | top limit=20 productId

source="tutorialdata.zip:*" index=main sourcetype=access_*

source="tutorialdata.zip:*" index=main sourcetype=access_combined_wcookie

sourcetype="access_combined_wcookie" action=*

source="tutorialdata.zip:*" sourcetype=vendor_sales

sourcetype=access_combined_wcookie action=purchase | stats count by product

sourcetype="secure*" fail* password

sourcetype="access_combined_wcookie" action=purchase

sourcetype="access_combined_wcookie" action=remove

sourcetype="access_combined_wcookie" (action=remove OR action=purchase)

sourcetype="access_combined_wcookie" action=* productId=* | table action, productId, status

sourcetype="access_combined_wcookie" action=* productId=* | table action, productId, status | rename action as "Customer Action", productId as "Product Id", status as "HTTP Status"

sourcetype=secure port "failed password" | rex "\s+(?<ports>port\s\d+)" | top src ports showperc=0

index=testinputs * sourcetype=access_combined_wcookie action=purchase

REPORTING COMMANDS

  • top – displays the most common values of a field
  • rare – displays the least common values of a field
  • stats – calculates statistics on the events that match your search criteria

sourcetype=secure password fail* "Failed password" | top date_month

sourcetype=secure password fail* "Failed password" | rare date_month

sourcetype="access_combined_wcookie" action=purchase | top productId

sourcetype="access_combined_wcookie" action=purchase | top productId limit=5

sourcetype="access_combined_wcookie" action=purchase | top productId limit=5 countfield="Unit Sold"

Search online transactions for purchases or lost sales events over the last 30 days.

Ans: sourcetype=access_combined (action=remove OR action=purchase)

Calculate the total value of sales as totalSales by action and product_name. Then, view the results as both a statistics table and column chart visualization.

Ans: sourcetype=access_combined (action=remove OR action=purchase)

| stats sum(price) as totalSales by action, product_name

Ensure product_name is on the x-axis, action identifies the series, and totalSales is on the y-axis.

Hint: xyseries command

Ans: sourcetype=access_combined (action=remove OR action=purchase)

| stats sum(price) as totalSales by action, product_name

| xyseries product_name, action, totalSales

Search online transactions for all HTTP status errors over the last 30 days.

Hint: status>399

Ans: sourcetype=access_combined status>399

Display customer interaction in the online store. Retrieve only clientip and referer_domain.

Search: sourcetype=access_combined | fields clientip, referer_domain

The table command returns a table formed from only the fields in the argument list.

Display the action, productId, and status of customer interactions in the online store.

Search: sourcetype=access_combined action=* productId=* | table action, productId, status

The rex command lets you extract fields at search time. It matches the value of a field against an unanchored regular expression (defaults to field=_raw).

Display IP address and port of potential attackers

Search: sourcetype=linux_secure port "failed password" | rex "\s+(?<ports>port\s\d+)" | top src ports showperc=0
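
To see what the rex pattern captures, the same unanchored match can be tried in Python. This is only a sketch: Python's re module spells named groups as (?P<name>...) rather than (?<name>...), and the raw event line below is a hypothetical sample, not taken from the tutorial data.

```python
import re

# Same pattern as in the rex command, with Python's named-group syntax.
PORT_RE = re.compile(r"\s+(?P<ports>port\s\d+)")

# Hypothetical raw event text standing in for _raw.
raw = "sshd[4747]: Failed password for invalid user jabber from 10.0.0.5 port 3187 ssh2"

# search() is unanchored: it scans the whole string for the first match.
match = PORT_RE.search(raw)
ports = match.group("ports") if match else None
print(ports)
```

The captured group becomes the new field (ports here), which is why the search can then pass it to top alongside src.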

During the last 24 hours, which IPs generated the most attacks?

Search: sourcetype=linux_secure password fail* | top src

During the last week, which were the top selling products from the online store?

Search: sourcetype=access_combined action=purchase | top product_name

distinct_count, dc – returns a count of unique values for a given field.

values – lists the unique values of a given field.
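
A minimal Python sketch of the difference between these functions, and the list function used in a later search. The hostnames and the helper names are hypothetical, and the sketch ignores any limits Splunk applies to the number of values returned.

```python
# Hypothetical s_hostname values from web appliance events, in event order.
visits = ["example.com", "splunk.com", "example.com", "docs.splunk.com"]


def dc(field_values):
    """Distinct count: how many unique values the field takes."""
    return len(set(field_values))


def values(field_values):
    """Unique values only, in sorted order."""
    return sorted(set(field_values))


def list_all(field_values):
    """Every value in the order seen, duplicates included."""
    return list(field_values)


print(dc(visits))
print(values(visits))
print(list_all(visits))
```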

How many unique websites have our employees visited?

Search: sourcetype=cisco_wsa_squid | stats dc(s_hostname)

Which websites have our employees accessed during the last 60 minutes?

Search: sourcetype=cisco_wsa_squid | stats list(s_hostname) by cs_username