Splunk Best Practices
About this ebook
This book will give you an edge over others through insights that will help you in day-to-day situations. Working with data from various sources in Splunk, and performing analysis on that data, can be tricky. With this book, you will learn the best practices of working with Splunk.
You'll learn about tools and techniques that will ease your life with Splunk, and will ultimately save you time. In some cases, it will adjust your thinking about what Splunk is, and what it can and cannot do. To start with, you'll get to know the best practices for getting data into Splunk, analyzing data, and packaging apps for distribution. Next, you'll discover the best practices in logging, operations, knowledge management, searching, and reporting. To finish, we will teach you how to troubleshoot Splunk searches, as well as deployment, testing, and development with Splunk.
Splunk Best Practices - Travis Marlette
Table of Contents
Splunk Best Practices
Credits
About the Author
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Application Logging
Loggers
Anatomy of a log
Log4*
Pantheios
Logging - logging facility for Python
Example of a structured log
Data types
Structured data - best practices
Log events
Common Log Format
Automatic Delimited Value Extraction (IIS/Apache) - best practice
Manual Delimited Value Extraction with REGEX
Step 1 - field mapping - best practice
Step 2 - adding the field map to structure the data (props/transforms)
Use correlation IDs - best practice
Correlation IDs and publication transactions - best practice
Correlation IDs and subscription transactions - best practices
Correlation IDs and database calls - best practices
Unstructured data
Event breaking - best practice
Best practices
Configuration transfer - best practice
Summary
2. Data Inputs
Agents
Splunk Universal Forwarder
Splunk Heavy Forwarder
Search Head Forwarder
Data inputs
API inputs
Database inputs
Monitoring inputs
Scripted inputs
Custom or not
Modular inputs
Windows inputs
Windows event logs / Perfmon
Deployment server
Know your data
Long delay intervals with lots of data
Summary
3. Data Scrubbing
Heavy Forwarder management
Managing your Heavy Forwarder
Manual administration
Deployment server
Important configuration files
Even data distribution
Common root cause
Knowledge management
Handling single- versus multi-line events
Manipulating raw data (pre-indexing)
Routing events to separate indexes
Black-holing unwanted events (filtering)
Masking sensitive data
Pre-index data masking
Post-index data masking
Setting a hostname per event
Summary
4. Knowledge Management
Anatomy of a Splunk search
Root search
Calculation/evaluation
Presentation/action
Best practices with search anatomy
The root search
Calculation/evaluation
Presentation/action
Knowledge objects
Eventtype Creation
Creation through the Splunk UI
Creation through the backend shell
Field extractions
Performing field extractions
Pre-indexing field extractions (index time)
Post-indexing field extractions (search time)
Creating index time field extractions
Creating search time field extractions
Creating field extractions using IFX
Creation through CLI
Summary
5. Alerting
Setting expectations
Time is literal, not relative
To quickly summarize
Be specific
To quickly summarize
Predictions
To quickly summarize
Anatomy of an alert
Search query results
Alert naming
The schedule
The trigger
The action
Throttling
Permissions
Location of action scripts
Example
Custom commands/automated self-healing
A word of warning
Summary
6. Searching and Reporting
General practices
Core fields (root search)
_time
Index
Sourcetype
Host
Source
Case sensitivity
Inclusive versus exclusive
Search modes
Fast Mode
Verbose Mode
Smart Mode (default)
Advanced charting
Overlay
Host CPU / MEM utilization
Xyseries
Appending results
timechart
stats
Week-over-week overlay
Day-over-day overlay
SPL to overlay (the hard way)
Timewrap (the easy way)
Summary
7. Form-Based Dashboards
Dashboards versus reports
Reports
Dashboards
Form-based
Drilldown
Report/data model-based
Search-based
Modules
Data input
Chart
Table
Single value
Map module
Tokens
Building a form-based dashboard
Summary
8. Search Optimization
Types of dashboard search panel
Raw data search panel
Shared search panel (base search)
Report reference panel
Data model/pivot reference panels
Raw data search
Shared searching using a base search
Creating a base search
Referencing a base search
Report referenced panels
Data model/pivot referenced panels
Special notes
Summary
9. App Creation and Consolidation
Types of apps
Search apps
Deployment apps
Indexer/cluster apps
Technical add-ons
Supporting add-ons
Premium apps
Consolidating search apps
Creating a custom app
App migrations
Knowledge objects
Dashboard consolidation
Search app navigation
Consolidating indexing/forwarding apps
Forwarding apps
Indexer/cluster apps
Summary
10. Advanced Data Routing
Splunk architecture
Clustering
Search head clustering
Indexer cluster
Multi-site redundancy
Leveraging load balancers
Failover methods
Putting it all together
Network segments
Production
Standard Integration Testing (SIT)
Quality assurance
Development
The DMZ (App Tier)
The data router
Building roads and maps
Building the UF input/output paths
Building the HF input/output paths
If you build it, they will come
Summary
Splunk Best Practices
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: September 2016
Production reference: 1150916
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78528-139-6
www.packtpub.com
Credits
About the Author
Travis Marlette has been working with Splunk since Splunk 4.0, and has over 7 years of statistical and analytical experience leveraging both Splunk and other technologies. He cut his teeth in the securities and equities division of the finance industry, routing stock market data and performing transactional analysis on stock market trading, as well as reporting security metrics for SEC and other federal audits.
His specialty is IT operational intelligence, which makes up the lion's share of Splunk work at many major companies. Reporting on security, system-specific, and proprietary application metrics is always a challenge for any company, and as IT continues to grow, specialists like this will become more and more prominent.
Working in finance, Travis has experience integrating Splunk with some of the newest and most complex technologies, such as:
SAS
HIVE
Teradata (data warehouse)
Oozie
EMC (Xtreme IO)
Datameer
ZFS
Compass
Cisco (Security/Network)
Platfora
Juniper (Security and Network)
IBM WebSphere
Cisco Call Manager
Java Management Systems (JVM)
Cisco UCS
IBM MQ Series
FireEye
Microsoft Active Directory
Snort
Microsoft Exchange
F5
Microsoft – OS
MapR (Hadoop)
Microsoft SQL
YARN (Hadoop)
Microsoft SCOM
NoSQL
Linux (Red Hat/CentOS)
Oracle
MySQL
Nagios
LDAP
TACACS+
ADS
Kerberos
Gigamon
Telecom Inventory Management
Riverbed Suite
Endace
ServiceNow
JIRA
Confluence
Travis holds a series of Microsoft, Juniper, Cisco, Splunk, and network security certifications. His knowledge and experience are truly his most valued currency, and this is demonstrated by every organization that has worked with him to reach its goals.
He has worked with Splunk installations ingesting anywhere from 80 to 150 GB daily, as well as 6 TB daily, and has provided value with each of the installations he's created for the companies he's worked with. He also knows when a project sponsor or manager requires more information about Splunk, and helps them understand what Splunk is and how it can best bring value to their organization without over-committing.
According to Travis, Splunk is not a 'crystal ball' made of unicorn tears and bottled rainbows, granting wishes and immediate gratification to the person who possesses it. It's an IT platform that requires good resources supporting it, and is limited only by the knowledge and imagination of those resources. With the right resources, that's a good limitation for a company to have.
Splunk acts as a 'Rosetta Stone' of sorts for machines. It takes thousands of machines, speaking totally different languages all at the same time, and translates that into something a human can understand. This, by itself, is powerful.
His passion for innovating new solutions and overcoming challenges with Splunk and other data science tools has been exercised every day in each of his roles. Those roles span industries, ranging from Bank of New York and Barclays Capital to the federal government. Thus far, he and the teams he has worked with have taken each of these organizations further than they have ever been on their Splunk journey. While he continues to bring visibility, add value, consolidate tools, share work, perform predictions, and implement cost savings, he is also often mentioned as the most resourceful, reliable, and goofy person in the organization. Travis says, 'A new Splunk implementation is like asking your older brother to turn on a fire hose so you can get a drink of water. Once it's on, just remember to breathe.'
About the Reviewer
Chris Ladd is a staff sales engineer at Splunk. He has been with Splunk for three years and has been a sales engineer for more than a decade. He has earned degrees from Southwestern University and the University of Houston. He resides in Chicago.
www.PacktPub.com
eBooks, discount offers, and more
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Preface
Within the working world of technology, there are hundreds of thousands of different applications, all (usually) logging in different formats. As Splunk experts, our job is to make all those logs speak human, which often seems an impossible task. With supported third-party applications, log formatting is sometimes out of our control. Take, for instance, Cisco, Juniper, or any other leading manufacturer.
These devices emit structured data specific to the manufacturer. There are also applications that we have more influence over, which are usually custom applications built for a specific purpose by the development staff of your organization. These are usually referred to as 'proprietary', 'in-house', or 'home-grown' applications, all of which mean the same thing.
The logs I am referencing belong to these proprietary in-house applications, which are often part of the middleware and usually control some of the most mission-critical services an organization can provide.
Proprietary applications can be written in anything, and logging is usually left up to the developers for troubleshooting. Until now, the process of manually scraping log files to troubleshoot quality assurance issues and system outages has been very specialized; usually, the developers are the only people who truly understand what those log messages mean.
That being said, developers often write their logs in a way that only they can understand, because ultimately it will be them doing the troubleshooting and code fixing when something severe breaks.
As an IT community, we haven't really examined the way we log things. Instead, we have tried to confine the confusion to the developers, and then have them help the other SMEs who provide operational support understand what is actually happening.
This method has been successful but time-consuming, and the true value of any SME lies in reducing a system's MTTR and increasing uptime. With any system, more transactions processed means a larger scale, and beyond about 20 machines, troubleshooting becomes more complex and time-consuming as a manual process.
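The structured-logging and correlation-ID practices that Chapter 1 walks through can be previewed with a minimal sketch using Python's standard logging module. The `JsonFormatter` class, the `payments` logger name, and the event messages below are illustrative assumptions, not taken from any real application; the point is that one self-describing JSON object per line, carrying a correlation ID, is something anyone (not just the original developer) can read, and something Splunk can extract fields from without custom work.

```python
import json
import logging
import uuid

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            # Carried on the record via logging's `extra` mechanism.
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        })

# Hypothetical application logger wired to emit JSON lines.
logger = logging.getLogger("payments")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# One correlation ID ties every event in a transaction together, so an
# operator can trace it end to end without reading the source code.
corr_id = str(uuid.uuid4())
logger.info("order received", extra={"correlation_id": corr_id})
logger.info("order validated", extra={"correlation_id": corr_id})
```

In Splunk terms, each line would land as a self-describing event whose fields can be pulled out with `spath` or JSON extraction, rather than requiring the props/transforms work that unstructured formats need.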
The goal of this book is to give you some techniques to build that bridge in your organization. We will assume you have a basic understanding of what Splunk does, so that we can provide a few tools to make your day-to-day life with Splunk easier, without getting bogged down in the vast array of SDKs, matching languages, and APIs. These tools range from intermediate to expert level. My hope is that at least one person can take at least one concept from this book to make their life easier.
What this book covers
Chapter 1, Application Logging, discusses where application data comes from, how that data gets into Splunk, and how Splunk reacts to it. You will develop applications or scripts, and also learn how to adjust Splunk to handle some non-standardized logging. Splunk is only as turnkey as the data you put into it. This means that if you have a 20-year-old application that logs unstructured data in debug mode only, your Splunk instance will not be turnkey. With a system such as Splunk, we can quote the data science experts in saying "garbage in, garbage out".
Chapter 2, Data Inputs, moves on to the kinds of data inputs Splunk uses to acquire data, and how to enable Splunk to use each of these methods. You will get a brief introduction to each of Splunk's data inputs.
Chapter 3, Data Scrubbing, discusses how to format all incoming data into a Splunk-friendly format before indexing, in order to ease search querying and knowledge management going forward.
Chapter 4, Knowledge Management, explains some techniques for managing the data coming into your Splunk indexers, some basics of leveraging knowledge objects to enhance search performance, and the pros and cons of pre- and post-index field extraction.
Chapter 5, Alerting, discusses the growing importance of Splunk alerting and its different levels. In the current corporate environment, intelligent alerting and alert 'noise' reduction are becoming more important due to machine sprawl, both horizontal and vertical. Later, we will discuss how to create intelligent alerts and manage them effectively, as well as some methods of 'self-healing' I've used in the past, and the successes and consequences of those methods, in order to help set expectations.
Chapter 6, Searching and Reporting, talks about the anatomy of a search, and then some key techniques that help in real-world scenarios. Many people understand search syntax; however, using it effectively (that is, becoming a search ninja) is something much more elusive and continuous. We will also see real-world use cases in