Splunk Best Practices

Ebook · 455 pages


About this ebook

This book will give you an edge over others through insights that will help you in day-to-day instances. Working with data from various sources in Splunk, and performing analysis on that data, can be tricky. With this book, you will learn the best practices of working with Splunk.

You'll learn about tools and techniques that will ease your life with Splunk, and will ultimately save you time. In some cases, it will adjust your thinking of what Splunk is, and what it can and cannot do. To start with, you'll get to know the best practices to get data into Splunk, analyze data, and package apps for distribution. Next, you'll discover the best practices in logging, operations, knowledge management, searching, and reporting. To finish off, we will teach you how to troubleshoot Splunk searches, as well as deployment, testing, and development with Splunk.

Language: English
Release date: Sep 21, 2016
ISBN: 9781785289415


    Book preview

    Splunk Best Practices - Travis Marlette

    Table of Contents

    Splunk Best Practices

    Credits

    About the Author

    About the Reviewer

    www.PacktPub.com

    eBooks, discount offers, and more

    Why subscribe?

    Preface

    What this book covers

    What you need for this book

    Who this book is for

    Conventions

    Reader feedback

    Customer support

    Downloading the example code

    Downloading the color images of this book 

    Errata

    Piracy

    Questions

    1. Application Logging

    Loggers

    Anatomy of a log

    Log4*

    Pantheios

    Logging - logging facility for Python

    Example of a structured log

    Data types

    Structured data - best practices

    Log events

    Common Log Format

    Automatic Delimited Value Extraction (IIS/Apache) - best practice

    Manual Delimited Value Extraction with REGEX

    Step 1 - field mapping - best practice

    Step 2 - adding the field map to structure the data (props/transforms)

    Use correlation IDs - best practice

    Correlation IDs and publication transactions - best practice

    Correlation IDs and subscription transactions - best practices

    Correlation IDs and database calls - best practices

    Unstructured data

    Event breaking - best practice

    Best practices

    Configuration transfer - best practice

    Summary

    2. Data Inputs

    Agents

    Splunk Universal Forwarder

    Splunk Heavy Forwarder

    Search Head Forwarder

    Data inputs

    API inputs

    Database inputs

    Monitoring inputs

    Scripted inputs

    Custom or not

    Modular inputs

    Windows inputs

    Windows event logs / Perfmon

    Deployment server

    Know your data

    Long delay intervals with lots of data

    Summary

    3. Data Scrubbing

    Heavy Forwarder management

    Managing your Heavy Forwarder

    Manual administration

    Deployment server

    Important configuration files

    Even data distribution

    Common root cause

    Knowledge management

    Handling single- versus multi-line events

    Manipulating raw data (pre-indexing)

    Routing events to separate indexes

    Black-holing unwanted events (filtering)

    Masking sensitive data

    Pre-index data masking

    Post-index data masking

    Setting a hostname per event

    Summary

    4. Knowledge Management

    Anatomy of a Splunk search

    Root search

    Calculation/evaluation

    Presentation/action

    Best practices with search anatomy

    The root search

    Calculation/evaluation

    Presentation/action

    Knowledge objects

    Eventtype Creation

    Creation through the Splunk UI

    Creation through the backend shell

    Field extractions

    Performing field extractions

    Pre-indexing field extractions (index time)

    Post-indexing field extractions (search time)

    Creating index time field extractions

    Creating search time field extractions

    Creating field extractions using IFX

    Creation through CLI

    Summary

    5. Alerting

    Setting expectations

    Time is literal, not relative

    To quickly summarize

    Be specific

    To quickly summarize

    Predictions

    To quickly summarize

    Anatomy of an alert

    Search query results

    Alert naming

    The schedule

    The trigger

    The action

    Throttling

    Permissions

    Location of action scripts

    Example

    Custom commands/automated self-healing

    A word of warning

    Summary

    6. Searching and Reporting

    General practices

    Core fields (root search)

    _time

    Index

    Sourcetype

    Host

    Source

    Case sensitivity

    Inclusive versus exclusive

    Search modes

    Fast Mode

    Verbose Mode

    Smart Mode (default)

    Advanced charting

    Overlay

    Host CPU / MEM utilization

    Xyseries

    Appending results

    timechart

    stats

    The week-over-week overlay

    Day-over-day overlay

    SPL to overlay (the hard way)

    Timewrap (the easy way)

    Summary

    7. Form-Based Dashboards

    Dashboards versus reports

    Reports

    Dashboards

    Form-based

    Drilldown

    Report/data model-based

    Search-based

    Modules

    Data input

    Chart

    Table

    Single value

    Map module

    Tokens

    Building a form-based dashboard

    Summary

    8. Search Optimization

    Types of dashboard search panel

    Raw data search panel

    Shared search panel (base search)

    Report reference panel

    Data model/pivot reference panels

    Raw data search

    Shared searching using a base search

    Creating a base search

    Referencing a base search

    Report referenced panels

    Data model/pivot referenced panels

    Special notes

    Summary

    9. App Creation and Consolidation

    Types of apps

    Search apps

    Deployment apps

    Indexer/cluster apps

    Technical add-ons

    Supporting add-ons

    Premium apps

    Consolidating search apps

    Creating a custom app

    App migrations

    Knowledge objects

    Dashboard consolidation

    Search app navigation

    Consolidating indexing/forwarding apps

    Forwarding apps

    Indexer/cluster apps

    Summary

    10. Advanced Data Routing

    Splunk architecture

    Clustering

    Search head clustering

    Indexer cluster

    Multi-site redundancy

    Leveraging load balancers

    Failover methods

    Putting it all together

    Network segments

    Production

    Standard Integration Testing (SIT)

    Quality assurance

    Development

    The DMZ (App Tier)

    The data router

    Building roads and maps

    Building the UF input/output paths

    Building the HF input/output paths

    If you build it, they will come

    Summary

    Splunk Best Practices


    Splunk Best Practices

    Copyright © 2016 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors, will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    First published: September 2016

    Production reference: 1150916

    Published by Packt Publishing Ltd.

    Livery Place

    35 Livery Street

    Birmingham B3 2PB, UK.

    ISBN 978-1-78528-139-6

    www.packtpub.com

    Credits

    About the Author

    Travis Marlette has been working with Splunk since Splunk 4.0, and has over 7 years of statistical and analytical experience leveraging both Splunk and other technologies. He cut his teeth in the securities and equities division of the finance industry, routing stock market data and performing transactional analysis on stock market trading, as well as reporting security metrics for SEC and other federal audits.

    His specialty is IT operational intelligence, which accounts for the lion's share of Splunk usage at many major companies. Reporting on security, system-specific, and proprietary application metrics is always a challenge for any company, and as IT grows in the modern day, having a specialist like this will become more and more prominent.

    Working in finance, Travis has experience integrating Splunk with some of the newest and most complex technologies, such as:

    SAS

    HIVE

    Teradata (Data Warehouse)

    Oozie

    EMC (Xtreme IO)

    Datameer

    ZFS

    Compass

    Cisco (Security/Network)

    Platfora

    Juniper (Security and Network)

    IBM WebSphere

    Cisco Call Manager

    Java Management Systems (JVM)

    Cisco UCS

    IBM MQ Series

    FireEye

    Microsoft Active Directory

    Snort

    Microsoft Exchange

    F5

    Microsoft – OS

    MapR (Hadoop)

    Microsoft SQL

    YARN (Hadoop)

    Microsoft SCOM

    NoSQL

    Linux (Red Hat / CentOS)

    Oracle

    MySQL

    Nagios

    LDAP

    TACACS+

    ADS

    Kerberos

    Gigamon

    Telecom Inventory Management

    Riverbed Suite

    Endace

    ServiceNow

    JIRA

    Confluence

    Travis has earned a series of Microsoft, Juniper, Cisco, Splunk, and network security certifications. His knowledge and experience are truly his most valued currency, and this is demonstrated by every organization that has worked with him to reach its goals.

    He has worked with Splunk installations that ingest anywhere from 80 to 150 GB daily up to 6 TB daily, and has provided value with each of the installations he has created for the companies he has worked with. He also knows when a project sponsor or manager requires more information about Splunk, and helps them understand what Splunk is and how it can best bring value to their organization without over-committing.

    According to Travis, Splunk is not a 'crystal ball' made of unicorn tears and bottled rainbows, granting wishes and immediate gratification to the person who possesses it. It's an IT platform that requires good resources supporting it, and is limited only by the knowledge and imagination of those resources. With the right resources, that's a good limitation for a company to have.

    Splunk acts as a 'Rosetta Stone' of sorts for machines. It takes thousands of machines, all speaking totally different languages at the same time, and translates that into something a human can understand. This, by itself, is powerful.

    His passion for innovating new solutions and overcoming challenges by leveraging Splunk and other data science tools has been exercised every day in each of his roles. Those roles are cross-industry, ranging from Bank of New York and Barclays Capital to the Federal Government. Thus far, he and the teams he has worked with have taken each of these organizations further than they have ever been on their Splunk journey. While he continues to bring visibility, add value, consolidate tools, share work, perform predictions, and implement cost savings, he is also often mentioned as the most resourceful, reliable, and goofy person in the organization. Travis says, 'A new Splunk implementation is like asking your older brother to turn on a fire hose so you can get a drink of water. Once it's on, just remember to breathe.'

    About the Reviewer

    Chris Ladd is a staff sales engineer at Splunk. He has been with Splunk for three years and has been a sales engineer for more than a decade. He has earned degrees from Southwestern University and the University of Houston. He resides in Chicago.

    www.PacktPub.com

    eBooks, discount offers, and more

    Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at customercare@packtpub.com for more details.

    At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

    https://www2.packtpub.com/books/subscription/packtlib

    Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

    Why subscribe?

    Fully searchable across every book published by Packt

    Copy and paste, print, and bookmark content

    On demand and accessible via a web browser

    Preface

    Within the working world of technology, there are hundreds of thousands of different applications, all (usually) logging in different formats. As Splunk experts, our job is to make all those logs speak human, which is often an impossible task. With third-party applications that provide support, log formatting is sometimes out of our control. Take, for instance, Cisco or Juniper, or any other leading manufacturer.

    These devices submit structured data, specific to the manufacturer. There are also applications that we have more influence on, usually custom applications built for a specific purpose by the development staff of your organization. These are usually referred to as 'proprietary', 'in-house', or 'home-grown' applications, all of which mean the same thing.

    The logs I am referencing belong to these proprietary, in-house (a.k.a. home-grown) applications, which are often part of the middleware and usually control some of the most mission-critical services an organization can provide.

    Proprietary applications can be written in anything, but logging is usually left up to the developers for troubleshooting, and up until now the process of manually scraping log files to troubleshoot quality assurance issues and system outages has been very specialized. By that I mean that, usually, the developers are the only people who truly understand what those log messages mean.

    That being said, developers often write their logs in a way that they can understand them, because ultimately it will be them doing the troubleshooting / code fixing when something severe breaks.

    As an IT community, we haven't really started taking a look at the way we log things. Instead, we have tried to limit the confusion to developers, and then have them help the other SMEs who provide operational support understand what is actually happening.

    This method has been successful, but time consuming, and the true value of any SME lies in reducing a system's MTTR and increasing uptime. With any system, more transactions processed means a larger scale, and after about 20 machines, troubleshooting with a manual process begins to get more complex and time consuming.

    The goal of this book is to give you some techniques to build that bridge in your organization. We will assume you have a basic understanding of what Splunk does, so that we can provide a few tools to make your day-to-day life with Splunk easier, without getting bogged down in the vast array of SDKs, matching languages, and APIs. These tools range from intermediate to expert levels. My hope is that at least one person can take at least one concept from this book to make their life easier.

    What this book covers

    Chapter 1, Application Logging, discusses where application data comes from, how that data gets into Splunk, and how Splunk reacts to it. You will develop applications, or scripts, and also learn how to adjust Splunk to handle some non-standardized logging. Splunk is only as turnkey as the data you put into it. This means that if you have a 20-year-old application that logs unstructured data in debug mode only, your Splunk instance will not be turnkey. With a system such as Splunk, we can quote the data science experts in saying "garbage in, garbage out".
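    As a hypothetical illustration of the "garbage in, garbage out" point, compare a free-text debug line with a single-line, key-value style JSON event. The logger name and the fields shown here are invented for the example; the idea is simply that structured, one-event-per-line output lets Splunk extract fields automatically, while free-form text usually requires custom parsing.

```python
import json
import logging

# Plain-text logger: human-readable, but Splunk sees one opaque message.
logger = logging.getLogger("orders")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def log_event(**fields):
    """Emit one JSON object per line; Splunk can auto-extract each key as a field."""
    logger.info(json.dumps(fields, sort_keys=True))

# Unstructured: fine for a developer's eyes, hard for a machine to parse.
logger.info("order 1234 finished ok in 42ms")

# Structured: every value becomes a searchable field (e.g. duration_ms>100).
log_event(event="order_completed", order_id=1234, status="ok", duration_ms=42)
```

    The structured variant costs almost nothing at write time, but it is the difference between searching `status=error` and writing a regex for every log format in the organization.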

    Chapter 2, Data Inputs, moves on to the kinds of data inputs Splunk uses to ingest data, and shows how to enable Splunk to use each of these methods. Finally, you will get a brief introduction to each of the data inputs for Splunk.

    Chapter 3, Data Scrubbing, discusses how to format all incoming data into a Splunk-friendly format before indexing, in order to ease search querying and knowledge management going forward.

    Chapter 4, Knowledge Management, explains some techniques for managing the data coming into your Splunk indexers, some basics of leveraging knowledge objects to enhance search performance, and the pros and cons of pre- and post-index field extraction.

    Chapter 5, Alerting, discusses the growing importance of Splunk alerting and the different levels of doing so. In the current corporate environment, intelligent alerting and alert 'noise' reduction are becoming more important due to machine sprawl, both horizontal and vertical. Later, we will discuss how to create intelligent alerts and manage them effectively, as well as some methods of 'self-healing' that I've used in the past, along with the successes and consequences of such methods, in order to assist in setting expectations.

    Chapter 6, Searching and Reporting, will talk about the anatomy of a search, and then some key techniques that help in real-world scenarios. Many people understand search syntax; however, using it effectively (a.k.a. becoming a search ninja) is something much more elusive and continuous. We will also see real world use-cases in
