PostgreSQL Query Optimization: The Ultimate Guide to Building Efficient Queries

Ebook, 481 pages, 4 hours

Rating: 4.0 (1 review)

By Henrietta Dombrovskaya, Boris Novikov and Anna Bailliekova

About this ebook

Write optimized queries. This book helps you write queries that perform fast and deliver results on time. You will learn that query optimization is not a dark art practiced by a small, secretive cabal of sorcerers. Any motivated professional can learn to write efficient queries from the get-go and capably optimize existing queries. You will learn to look at the process of writing a query from the database engine’s point of view, and know how to think like the database optimizer.

The book begins with a discussion of what a performant system is and progresses to measuring performance and setting performance goals. It introduces different classes of queries and optimization techniques suitable to each, such as the use of indexes and specific join algorithms. You will learn to read and understand query execution plans along with techniques for influencing those plans for better performance. The book also covers advanced topics such as the use of functions and procedures, dynamic SQL, and generated queries. All of these techniques are then used together to produce performant applications, avoiding the pitfalls of object-relational mappers.

What You Will Learn

Identify optimization goals in OLTP and OLAP systems
Read and understand PostgreSQL execution plans
Distinguish between short queries and long queries
Choose the right optimization technique for each query type
Identify indexes that will improve query performance
Optimize full table scans
Avoid the pitfalls of object-relational mapping systems
Optimize the entire application rather than just database queries

Who This Book Is For
IT professionals working in PostgreSQL who want to develop performant and scalable applications; anyone whose job title contains the words “database developer” or “database administrator,” or who is a backend developer charged with programming database calls; and system architects involved in the overall design of application systems running against a PostgreSQL database.

Language: English

Publisher: Apress

Release date: Apr 22, 2021

ISBN: 9781484268858

Author

Henrietta Dombrovskaya

Related ebooks

PostgreSQL for Jobseekers: Introduction to PostgreSQL administration for modern DBAs (English Edition)
Ebook
by Sonia Valeja
Rating: 0 out of 5 stars
0 ratings
Everyday Data Structures
Ebook
by William Smith
Rating: 0 out of 5 stars
0 ratings
Refactoring Legacy T-SQL for Improved Performance: Modern Practices for SQL Server Applications
Ebook
by Lisa Bohm
Rating: 0 out of 5 stars
0 ratings
Mastering PostgreSQL 12 - Third Edition: Advanced techniques to build and administer scalable and reliable PostgreSQL database applications, 3rd Edition
Ebook
by Hans-Jürgen Schönig
Rating: 0 out of 5 stars
0 ratings
Database Management for Business Leaders: Building and Using Data Solutions That Work for You
Ebook
by Larry Ruddell
Rating: 0 out of 5 stars
0 ratings
Data Analytics with SAS: Explore your data and get actionable insights with the power of SAS (English Edition)
Ebook
by Nishant Sidana
Rating: 0 out of 5 stars
0 ratings
PostgreSQL for Data Architects
Ebook
by Jayadevan Maymala
Rating: 0 out of 5 stars
0 ratings
Troubleshooting PostgreSQL
Ebook
by Hans-Jürgen Schönig
Rating: 5 out of 5 stars
Learning Elasticsearch 7.x: Index, Analyze, Search and Aggregate Your Data Using Elasticsearch (English Edition)
Ebook
by Anurag Srivastava
Rating: 0 out of 5 stars
0 ratings
Data Science with Jupyter: Master Data Science skills with easy-to-follow Python examples
Ebook
by Prateek Gupta
Rating: 0 out of 5 stars
0 ratings
Deep Learning for Computer Vision with SAS: An Introduction
Ebook
by Robert Blanchard
Rating: 0 out of 5 stars
0 ratings
PostgreSQL Configuration: Best Practices for Performance and Security
Ebook
by Baji Shaik
Rating: 0 out of 5 stars
0 ratings
SQL Server Query Performance Tuning
Ebook
by Grant Fritchey
Rating: 0 out of 5 stars
0 ratings
The Modern Data Warehouse in Azure: Building with Speed and Agility on Microsoft’s Cloud Platform
Ebook
by Matt How
Rating: 0 out of 5 stars
0 ratings
Practical Data Science with Jupyter: Explore Data Cleaning, Pre-processing, Data Wrangling, Feature Engineering and Machine Learning using Python and Jupyter (English Edition)
Ebook
by Prateek Gupta
Rating: 0 out of 5 stars
0 ratings
Practical hapi: Build Your Own hapi Apps and Learn from Industry Case Studies
Ebook
by Kanika Sud
Rating: 0 out of 5 stars
0 ratings
Backend Developer in 30 Days: Acquire Skills on API Designing, Data Management, Application Testing, Deployment, Security and Performance Optimization (English Edition)
Ebook
by Pedro Marquez-Soto
Rating: 0 out of 5 stars
0 ratings
Mastering PL/SQL Through Illustrations: From Learning Fundamentals to Developing Efficient PL/SQL Blocks (English Edition)
Ebook
by B Chandra
Rating: 0 out of 5 stars
0 ratings
Data Analysis and Business Modeling with Excel 2013
Ebook
by David Rojas
Rating: 1 out of 5 stars
Getting Structured Data from the Internet: Running Web Crawlers/Scrapers on a Big Data Production Scale
Ebook
by Jay M. Patel
Rating: 0 out of 5 stars
0 ratings
Software Development Accelerated Essentials: What You Didn't Know, You Needed to Know
Ebook
by Ed Gomez
Rating: 0 out of 5 stars
0 ratings
Advanced Analytics with Transact-SQL: Exploring Hidden Patterns and Rules in Your Data
Ebook
by Dejan Sarka
Rating: 0 out of 5 stars
0 ratings
SQL
Ebook
by Brandon Cooper
Rating: 0 out of 5 stars
0 ratings
Expert T-SQL Window Functions in SQL Server 2019: The Hidden Secret to Fast Analytic and Reporting Queries
Ebook
by Kathi Kellenberger
Rating: 0 out of 5 stars
0 ratings
Practical Full Stack Machine Learning: A Guide to Build Reliable, Reusable, and Production-Ready Full Stack ML Solutions
Ebook
by Alok Kumar
Rating: 0 out of 5 stars
0 ratings
DATABASE From the conceptual model to the final application in Access, Visual Basic, Pascal, Html and Php: Inside, examples of applications created with Access, Visual Studio, Lazarus and Wamp
Ebook
by Olga Maria Stefania Cucaro
Rating: 0 out of 5 stars
0 ratings
Pro .NET Memory Management: For Better Code, Performance, and Scalability
Ebook
by Konrad Kokosa
Rating: 0 out of 5 stars
0 ratings
Mastering Java for Data Science
Ebook
by Alexey Grigorev
Rating: 5 out of 5 stars
Hands-on Data Virtualization with Polybase: Administer Big Data, SQL Queries and Data Accessibility Across Hadoop, Azure, Spark, Cassandra, MongoDB, CosmosDB, MySQL and PostgreSQL (English Edition)
Ebook
by Pablo Alejandro Echeverria Barrios
Rating: 0 out of 5 stars
0 ratings
Data Structures and Algorithms with Go: Create efficient solutions and optimize your Go coding skills (English Edition)
Ebook
by Dušan Stojanović
Rating: 0 out of 5 stars
0 ratings

Databases For You

CompTIA DataSys+ Study Guide: Exam DS0-001
Ebook
by Mike Chapple
Rating: 0 out of 5 stars
0 ratings
Spring in Action, Sixth Edition
Ebook
by Craig Walls
Rating: 5 out of 5 stars
COBOL Basic Training Using VSAM, IMS and DB2
Ebook
by Robert Wingate
Rating: 5 out of 5 stars
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
Ebook
by Walter Shields
Rating: 4 out of 5 stars
Practical Data Analysis
Ebook
by Hector Cuesta
Rating: 4 out of 5 stars
Business Intelligence Strategy and Big Data Analytics: A General Management Perspective
Ebook
by Steve Williams
Rating: 5 out of 5 stars
Grokking Algorithms: An illustrated guide for programmers and other curious people
Ebook
by Aditya Bhargava
Rating: 4 out of 5 stars
THE STEP BY STEP GUIDE FOR SUCCESSFUL IMPLEMENTATION OF DATA LAKE-LAKEHOUSE-DATA WAREHOUSE: "THE STEP BY STEP GUIDE FOR SUCCESSFUL IMPLEMENTATION OF DATA LAKE-LAKEHOUSE-DATA WAREHOUSE"
Ebook
by AJIT DASH
Rating: 3 out of 5 stars
HTML, CSS, Bootstrap, Php, Javascript and MySql: All you need to know to create a dynamic site
Ebook
by Olga Maria Stefania Cucaro
Rating: 4 out of 5 stars
COMPUTER SCIENCE FOR ROOKIES
Ebook
by Angel Bahabwa
Rating: 0 out of 5 stars
0 ratings
Learn SQL in 24 Hours
Ebook
by Alex Nordeen
Rating: 5 out of 5 stars
SQL Clearly Explained
Ebook
by Jan L. Harrington
Rating: 5 out of 5 stars
Building a Scalable Data Warehouse with Data Vault 2.0
Ebook
by Daniel Linstedt
Rating: 4 out of 5 stars
Serverless Architectures on AWS, Second Edition
Ebook
by Peter Sbarski
Rating: 5 out of 5 stars
Data Mining: Concepts and Techniques
Ebook
by Jiawei Han
Rating: 4 out of 5 stars
Oracle DBA Mentor: Succeeding as an Oracle Database Administrator
Ebook
by Brian Peasland
Rating: 0 out of 5 stars
0 ratings
Access 2019 For Dummies
Ebook
by Laurie A. Ulrich
Rating: 0 out of 5 stars
0 ratings
Relational Database Design and Implementation
Ebook
by Jan L. Harrington
Rating: 5 out of 5 stars
Learn SQL Server Administration in a Month of Lunches
Ebook
by Don Jones
Rating: 0 out of 5 stars
0 ratings
Blockchain Basics: A Non-Technical Introduction in 25 Steps
Ebook
by Daniel Drescher
Rating: 5 out of 5 stars
Getting Started with SQL Server 2014 Administration
Ebook
by Gethyn Ellis
Rating: 0 out of 5 stars
0 ratings
Data Governance: How to Design, Deploy and Sustain an Effective Data Governance Program
Ebook
by John Ladley
Rating: 4 out of 5 stars
The SQL Workshop: Learn to create, manipulate and secure data and manage relational databases with SQL
Ebook
by Frank Solomon
Rating: 0 out of 5 stars
0 ratings
SQL Programming & Database Management For Absolute Beginners SQL Server, Structured Query Language Fundamentals: "Learn - By Doing" Approach And Master SQL
Ebook
by William Sullivan
Rating: 5 out of 5 stars
A Concise Guide to Object Orientated Programming
Ebook
by Alasdair Gilchrist
Rating: 0 out of 5 stars
0 ratings
Access 2010 All-in-One For Dummies
Ebook
by Alison Barrows
Rating: 4 out of 5 stars
Beginning Microsoft Power BI: A Practical Guide to Self-Service Data Analytics
Ebook
by Dan Clark
Rating: 0 out of 5 stars
0 ratings
Go in Action
Ebook
by Erik St. Martin
Rating: 5 out of 5 stars
Python and SQLite Development
Ebook
by Agus Kurniawan
Rating: 0 out of 5 stars
0 ratings
The Visual Imperative: Creating a Visual Culture of Data Discovery
Ebook
by Lindy Ryan
Rating: 4 out of 5 stars

Related podcast episodes

PostgreSQL vs. Oracle Database - Why Open Source Prevails
Podcast episode
by Continuous improvement
0 ratings
Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel: Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. As the sophistication increases, so does the complexity, leading to challenges for user experience. Jignesh Patel has been researching these areas for several years in his work as a professor at Carnegie Mellon University. In this episode he illuminates the landscape of problems that we are faced with and how his research is aimed at helping to solve these problems.
Podcast episode
by Data Engineering Podcast
0 ratings
Agile Development for Data Scientists, Part 1: The Good: If you're a data scientist at a firm that does a …
Podcast episode
by Linear Digressions
0 ratings
Data Sharing Across Business And Platform Boundaries: Sharing data is a simple concept, but complicated to implement well. There are numerous business rules and regulatory concerns that need to be applied. There are also numerous technical considerations to be made, particularly if the producer and consumer of the data aren't using the same platforms. In this episode Andrew Jefferson explains the complexities of building a robust system for data sharing, the techno-social considerations, and how the Bobsled platform that he is building aims to simplify the process.
Podcast episode
by Data Engineering Podcast
0 ratings
Putting machine learning into a database: Most data scientists bounce back and forth regula…
Podcast episode
by Linear Digressions
0 ratings
Harnessing Generative AI For Creating Educational Content With Illumidesk: Generative AI has unlocked a massive opportunity for content creation. There is also an unfulfilled need for experts to be able to share their knowledge and build communities. Illumidesk was built to take advantage of this intersection. In this episode Greg Werner explains how they are using generative AI as an assistive tool for creating educational material, as well as building a data driven experience for learners.
Podcast episode
by Data Engineering Podcast
0 ratings
Ep. 37 - The Rise of the Data Engineer: When Maxime worked at Facebook, his role started evolving. He was developing new skills, new ways of doing things, and new tools. And — more often than not — he was turning his back on traditional methods. He was a pioneer. He was a...
Podcast episode
by freeCodeCamp Podcast
0 ratings
Making Email Better With AI At Shortwave: Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave he was focused on making email more productive. When AI started gaining adoption he realized that he had even more potential for a transformative experience. In this episode he shares the technical challenges that he and his team have overcome in integrating AI into their product, as well as the benefits and features that it provides to their customers.
Podcast episode
by Data Engineering Podcast
0 ratings
ProcurementSoftware.site – The FREE resource for digital procurement
Podcast episode
by The Procurement Software Podcast
0 ratings
Designing Data Platforms For Fintech Companies: Working with financial data requires a high degree of rigor due to the numerous regulations and the risks involved in security breaches. In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector.
Podcast episode
by Data Engineering Podcast
0 ratings
Modern Customer Data Platform Principles: Databases and analytics architectures have gone through several generational shifts. A substantial amount of the data that is being managed in these systems is related to customers and their interactions with an organization. In this episode Tasso Argyros, CEO of ActionIQ, gives a summary of the major epochs in database technologies and how he is applying the capabilities of cloud data warehouses to the challenge of building more comprehensive experiences for end-users through a modern customer data platform (CDP).
Podcast episode
by Data Engineering Podcast
0 ratings
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary: Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they all have different technologies and workflows that they focus on. To bring observability to dbt projects the team at Elementary embedded themselves into the workflow. In this episode Maayan Salom explores the approach that she has taken to bring observability, enhanced testing capabilities, and anomaly detection into every step of the dbt developer experience.
Podcast episode
by Data Engineering Podcast
0 ratings
Composable Data Analytics
Podcast episode
by The Cloudcast
0 ratings
#08 - Tech stack: Metabase, Superset, Redash, Grafana
Podcast episode
by TOPP - The Open Podcast Podcast
0 ratings
Justin Dux - Be The Match: Sign-up for The Technopath Way Weekly Newsletter here: technopath.ac-page.com/the-technopath-way-sign-up Be The Match How can I check if I’m still registered in the database if I signed up years ago? You can call this number to find out: 1 (800)...
Podcast episode
by The Technopath Way: Productivity through tech for nonprofits
0 ratings
The Future of Data Science Platforms is Accessibility // Skylar Payne // Coffee Session #65
Podcast episode
by MLOps.community
0 ratings
MLOps Meetup #24 // How to Become a Better Data Scientist: The Definite Guide // Alexey Grigorev
Podcast episode
by MLOps.community
0 ratings
WBSP184: Grow Your Business by Learning the Best Practices of Data-Sharing Across Multiple Entities, a Live Interview w/ a Panel of Experts
Podcast episode
by WBSRocks: Business Growth with ERP and Digital Transformation
0 ratings
Understanding Time-Series Database Patterns
Podcast episode
by The Cloudcast
0 ratings
WBSP470: Grow Your Business by Understanding ERPNext’s Capabilities, an Objective Panel Discussion
Podcast episode
by WBSRocks: Business Growth with ERP and Digital Transformation
0 ratings
66: A guide to data models and dynamic dashboards for marketers
Podcast episode
by Humans of Martech
0 ratings
When You Say Data Scientist Do You Mean Data Engineer? Lessons Learned From Start Up Life // Elizabeth Chabot
Podcast episode
by MLOps.community
0 ratings
How Data Platforms Affect ML & AI // Jake Watson // #207
Podcast episode
by MLOps.community
0 ratings
5. Rocio Ng - Data science and product management at LinkedIn
Podcast episode
by Towards Data Science
0 ratings
Oracle Data Lakehouse: With each passing day, more and more data sources are sending greater volumes of data across the globe. For any organization, this combination of structured and unstructured data continues to be a challenge. Data lakehouses link, correlate, and...
Podcast episode
by Oracle University Podcast
0 ratings
Automating Analytics Teams
Podcast episode
by The Cloudcast
0 ratings
Estimating Software Projects, and Why It's Hard: If you’re like most software engineers and, espec…
Podcast episode
by Linear Digressions
0 ratings
Building An Internal Database As A Service Platform At Cloudflare: Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale.
Podcast episode
by Data Engineering Podcast
0 ratings
Designing A Non-Relational Database Engine: Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database.
Podcast episode
by Data Engineering Podcast
0 ratings
Build A Data Lake For Your Security Logs With Scanner: Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. The majority of products that are available either require too much effort to structure the logs, or aren't fast enough for interactive use cases. Cliff Crosland co-founded Scanner to provide fast querying of high scale log data for security auditing. In this episode he shares the story of how it got started, how it works, and how you can get started with it.
Podcast episode
by Data Engineering Podcast
0 ratings

PostgreSQL Query Optimization - Henrietta Dombrovskaya

H. Dombrovskaya et al., PostgreSQL Query Optimization, https://doi.org/10.1007/978-1-4842-6885-8_1

1. Why Optimize?

Henrietta Dombrovskaya¹, Boris Novikov² and Anna Bailliekova³

(1) Braviant Holdings, Chicago, IL, USA

(2) HSE University, Saint Petersburg, Russia

(3) Zendesk, Madison, WI, USA

This chapter covers why optimization is such an important part of database development. You will learn the differences between declarative languages, like SQL, and imperative languages, like Java, which may be more familiar, and how these differences affect programming style. We also demonstrate that optimization applies not only to database queries but also to database design and application architecture.

What Do We Mean by Optimization?

In the context of this book, optimization means any transformation that improves system performance. This definition is purposely very generic, since we want to emphasize that optimization is not a separate development phase. Quite often, database developers try to just make it work first and optimize later. We do not think that this approach is productive. Writing a query without having any idea of how long it will take to run creates a problem that could have been avoided altogether by writing it the right way from the start. We hope that by the time you finish this book, you’ll be prepared to optimize in precisely this fashion: as an integrated part of query development.

We will present some specific techniques; however, the most important thing is to understand how a database engine processes a query and how a query planner decides what execution path to choose. When we teach optimization in a classroom setting, we often say, “Think like a database!” Look at your query from the point of view of a database engine, and imagine what it has to do to execute that query; imagine that you have to do it yourself instead of the database engine doing it for you. By thinking about the scope of work, you can avoid imposing suboptimal execution plans. This is discussed in more detail in subsequent chapters.

If you practice thinking like a database long enough, it will become a natural way of thinking, and you will be able to write queries correctly right away, often without the need for future optimization.

Why It Is Difficult: Imperative and Declarative

Why isn’t it enough to write a SQL statement which returns the correct result? That’s what we expect when we write application code. Why is it different in SQL, and why is it that two queries that yield the same result may drastically differ in execution time? The underlying source of the problem is that SQL is a declarative language. That means that when we write a SQL statement, we describe the result we want to get, but we do not specify how that result should be obtained. By contrast, in an imperative language, we specify what to do to obtain a desired result, that is, the sequence of steps that should be executed.

As discussed in Chapter 2, the database optimizer chooses the best way of doing it. What is best is determined by many different factors, such as storage structures, indexes, and data statistics.

Let’s look at a simple example; consider the queries in Listing 1-1 and Listing 1-2.

SELECT flight_id
      ,departure_airport
      ,arrival_airport
FROM flight
WHERE scheduled_arrival BETWEEN
      '2020-10-14' AND '2020-10-15';

Listing 1-1

A query selecting flights with the BETWEEN operator.

SELECT flight_id
      ,departure_airport
      ,arrival_airport
FROM flight
WHERE scheduled_arrival::date = '2020-10-14';

Listing 1-2

A query selecting flights by casting to date.

These two queries look almost identical and should yield the same results, with one edge case: BETWEEN is inclusive, so a flight arriving at exactly midnight on October 15 matches only the first query. However, the execution time will be different because the work done by the database engine will be different. In Chapter 5, we will explain why this happens and how to choose the best query from a performance standpoint.
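One way to see the difference for yourself is to ask PostgreSQL for the plan of each query with EXPLAIN. The sketch below assumes a plain B-tree index on scheduled_arrival; the index name is hypothetical, and whether the planner actually uses the index depends on the data and statistics in your database.

```sql
-- Assumption: a plain B-tree index on the filtered column; the index name is hypothetical.
CREATE INDEX IF NOT EXISTS flight_scheduled_arrival_idx
    ON flight (scheduled_arrival);

-- Listing 1-1 form: the predicate is on the bare column,
-- so the index above is a candidate access path.
EXPLAIN
SELECT flight_id
      ,departure_airport
      ,arrival_airport
FROM flight
WHERE scheduled_arrival BETWEEN
      '2020-10-14' AND '2020-10-15';

-- Listing 1-2 form: the predicate is on an expression over the column,
-- which a plain index on scheduled_arrival does not match.
EXPLAIN
SELECT flight_id
      ,departure_airport
      ,arrival_airport
FROM flight
WHERE scheduled_arrival::date = '2020-10-14';
```

EXPLAIN prints the chosen plan without running the query, so comparing the two outputs is cheap even on a large table.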

Thinking imperatively is natural for humans. Generally, when we think about accomplishing a task, we think about the steps that we need to take. Similarly, when we think about a complex query, we think about the sequence of conditions we need to apply to achieve the desired result. However, if we force the database engine to follow this sequence strictly, the result might not be optimal.

For example, let’s try to find out how many people with frequent flyer level 4 fly out of Chicago for Independence Day. If at the first step you want to select all frequent flyers with level 4, you may write something like this:

SELECT * FROM frequent_flyer WHERE level = 4

Then, you may want to select these people’s account numbers:

SELECT * FROM account WHERE frequent_flyer_id IN (
    SELECT frequent_flyer_id FROM frequent_flyer WHERE level = 4
)

And then, if you want to find all bookings made by these people, you might write the following:

WITH level4 AS (SELECT * FROM account WHERE
    frequent_flyer_id IN (
        SELECT frequent_flyer_id FROM frequent_flyer WHERE level = 4
    ))
SELECT * FROM booking WHERE account_id IN
    (SELECT account_id FROM level4)

Possibly, next, you want to find which of these bookings are for flights which originate in Chicago on July 4. If you continue to construct the query in a similar manner, the next step will be the code in Listing 1-3.

WITH bk AS (
    WITH level4 AS (SELECT * FROM account WHERE
        frequent_flyer_id IN (
            SELECT frequent_flyer_id FROM frequent_flyer WHERE level = 4
        ))
    SELECT * FROM booking WHERE account_id IN
        (SELECT account_id FROM level4
        ) )
SELECT * FROM bk WHERE bk.booking_id IN
    (SELECT booking_id FROM booking_leg WHERE
        leg_num = 1 AND is_returning IS false
        AND flight_id IN (
            SELECT flight_id FROM flight
            WHERE
                departure_airport IN ('ORD', 'MDW')
                AND scheduled_departure::date = '2020-07-04')
    )

Listing 1-3

Imperatively constructed query

At the end, you may want to calculate the actual number of travelers. This can be achieved with the query in Listing 1-4.

WITH bk_chi AS (
    WITH bk AS (
        WITH level4 AS (SELECT * FROM account WHERE
            frequent_flyer_id IN (
                SELECT frequent_flyer_id FROM frequent_flyer WHERE level = 4
            ))
        SELECT * FROM booking WHERE account_id IN
            (SELECT account_id FROM level4
            ) )
    SELECT * FROM bk WHERE bk.booking_id IN
        (SELECT booking_id FROM booking_leg WHERE
            leg_num = 1 AND is_returning IS false
            AND flight_id IN (
                SELECT flight_id FROM flight
                WHERE
                    departure_airport IN ('ORD', 'MDW')
                    AND scheduled_departure::date = '2020-07-04')
        ))
SELECT count(*) FROM passenger WHERE booking_id IN (
    SELECT booking_id FROM bk_chi)

Listing 1-4

Calculating a total number of passengers

With the query constructed like this, you are not letting the query planner choose the best execution path, because the sequence of actions is hard-coded. Although the preceding statement is written in a declarative language, it is imperative by nature.

Instead, to write a declarative query, simply specify what you need to retrieve from the database, as shown in Listing 1-5.

SELECT count(*) FROM
    booking bk
    JOIN booking_leg bl ON bk.booking_id = bl.booking_id
    JOIN flight f ON f.flight_id = bl.flight_id
    JOIN account a ON a.account_id = bk.account_id
    JOIN frequent_flyer ff ON ff.frequent_flyer_id = a.frequent_flyer_id
    JOIN passenger ps ON ps.booking_id = bk.booking_id
WHERE level = 4
    AND leg_num = 1
    AND is_returning IS false
    AND departure_airport IN ('ORD', 'MDW')
    AND scheduled_departure BETWEEN '2020-07-04'
    AND '2020-07-05'

Listing 1-5

Declarative query to calculate the number of passengers

This way, you allow the database to decide which order of operations is best, which may vary depending on the distribution of values in the relevant columns.

You may want to run these queries after all required indexes are built in Chapter 5.

Optimization Goals

So far, we have implied that a performant query is a query which executes quickly. However, that definition is neither precise nor complete. Even if, for a moment, we consider reduction of execution time as the sole goal of optimization, the question remains: what execution time is good enough? For a monthly general ledger of a big corporation, completion within one hour may be an excellent execution time. For a daily marketing analysis, minutes might be great. For an executive dashboard with a dozen reports, refresh within 10 seconds may be the best time we can achieve. For a function called from a web application, even a hundred milliseconds can be alarmingly slow.

In addition, for the same query, execution time may vary at different times of day or with different database loads. In some cases, we might be interested in average execution time. If a system has a hard timeout, we may want to measure performance by capping the maximum execution time. There is also a subjective component in response time measurement. Ultimately, a company is interested in user satisfaction. Most of the time, user satisfaction depends on response time, but it is also a subjective characteristic.
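As a rough illustration of looking at both average and maximum execution time, the sketch below reads the statistics collected by the pg_stat_statements extension. It assumes the extension is installed and loaded, and that the server is PostgreSQL 13 or later; it is one possible way to gather these numbers, not a prescribed method.

```sql
-- Assumptions: the pg_stat_statements extension is installed and loaded via
-- shared_preload_libraries, and the server is PostgreSQL 13 or later
-- (earlier releases name these columns mean_time and max_time).
SELECT query,
       calls,
       round(mean_exec_time::numeric, 2) AS mean_ms,  -- average time per call
       round(max_exec_time::numeric, 2)  AS max_ms    -- slowest single call
FROM pg_stat_statements
ORDER BY max_exec_time DESC
LIMIT 10;
```

Sorting by max_exec_time suits a system with a hard timeout; sorting by mean_exec_time or by calls highlights different candidates.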

However, beyond execution time, other characteristics may be taken into account. For example, a service provider may be interested in maximizing system throughput. A small startup may be interested in minimizing resource utilization without compromising the system's response time. We know one company which increased the system's main memory to keep the execution time fast. Their goal was to make sure that the whole database could fit into main memory. That worked for a while until the database grew bigger than any main memory configuration available.

How do we define optimization goals? We use the familiar SMART goal framework. SMART goals are

Specific

Measurable

Achievable (attainable)

Result-based (relevant)

Time-bound (time-driven)

Most people know about SMART goals applied to health and fitness, but the same concept is perfectly applicable to query optimization. Examples of SMART goals are presented in Table 1-1.

Table 1-1

SMART goal examples

Optimizing Processes

It is essential to bear in mind that a database does not exist in a vacuum. A database is the foundation for multiple, often independent applications and systems. For any user (external or internal), overall system performance is the one they experience and the one that matters.

At the organization level, the objective is to reach better performance of the whole system. It might be response time or throughput (essential for the service provider) or (most likely) a balance of both. Nobody is interested in database optimizations that have no impact on overall performance.

Database developers and DBAs often tend to over-optimize any bad query that comes to their attention, just because it is bad. At the same time, their work is often isolated from both application development and business analytics. This is one reason optimization efforts may appear to be less productive than they could be. A SQL query cannot be optimized in isolation, outside the context of its purpose and the environment in which it is executed.

Since queries might not be written declaratively, the original purpose of a query might not be evident. Finding out the business intent of what is to be done might be the first and the most critical optimization step. Moreover, questions about the purpose of a report might lead to the conclusion that it is not needed at all. In one case, questioning the purpose of the most long-running reports allowed us to cut the total traffic on the reporting server by 40%.

Optimizing OLTP and OLAP

There are many ways to classify databases, and different database classes may differ in both performance criteria and optimization techniques. Two major classes are OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing). OLTP databases support applications, and OLAP databases support BI and reporting. Through the course of this book, we will emphasize different approaches to OLTP and OLAP optimization. We will introduce the concepts of short queries and long queries and explain how to distinguish one from the other.

Hint

Whether a query is short or long does not depend on the length of the SQL statement.

In the majority of cases, in OLTP systems we are optimizing short queries and in OLAP systems both short and long queries.

Database Design and Performance

We have already mentioned that we do not like the concept of “first write and then optimize” and that this book's goal is to help you write queries right from the start. When should a developer start thinking about performance of the query they are working on? The answer is: the sooner, the better. Ideally, optimization starts from requirements. In practice, this is not always the case, although gathering requirements is essential.

To be more precise, gathering requirements allows us to come up with the best database design, and database design can impact performance.

If you are a DBA, chances are, from time to time, you get requests to review new tables and views, which means you need to evaluate someone else’s database design. If you do not have any exposure to what a new project is about and the purpose of the new tables and views, there is not much you can do to determine whether the proposed design is optimal. The only thing you may be able to evaluate without going into the details of the business requirements is whether the database design is normalized. Even that might not be obvious without knowing the business specifics.

The only way to evaluate a proposed database design is to ask the right questions. The right questions include questions about what real-life objects the tables represent. Thus, optimization starts with gathering requirements. To illustrate that statement, let’s look at the following example: in this database, we need to store user accounts, and we need to store each account holder’s phone number(s). Two possible designs are shown in Figures 1-1 and 1-2, respectively.

Figure 1-1

Single-table design

Figure 1-2

Two-table design

Which of the two designs is the right one? It depends on the intended usage of the data. If phone numbers are never used as search criteria, are selected only as part of an account (to be displayed on the customer support screen), and the UI has fields labeled with specific phone types, then a single-table design is more appropriate.

However, if we want to search by phone number regardless of type, having all phones in a separate table will make the search more performant.

Also, users are often asked to indicate which phone number is their primary phone. It is easy to add one Boolean attribute is_primary to the two-table design, but it will be more complicated in the one-table design. An additional complication might arise when somebody does not have a landline or a work phone, which happens often. On the other hand, people often have more than one cell phone, or they might have a virtual number, like Google Voice, and they might want to record that number as the primary number to reach them. All these considerations are in favor of the two-table design.

Lastly, we can evaluate the frequency of each use case and how critical response time is in each case.
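To make the trade-off concrete, here is a minimal sketch of the two designs and of a search by phone number in each. All table names, column names, index names, and the sample phone number are assumptions made for illustration; the tables are named so that both designs can coexist in one schema.

```sql
-- Single-table design: one column per phone type (column names are assumed).
CREATE TABLE account_single (
    account_id  integer PRIMARY KEY,
    home_phone  text,
    work_phone  text,
    cell_phone  text
);

-- Two-table design: one row per phone number (names are assumed).
CREATE TABLE account_multi (
    account_id  integer PRIMARY KEY
);

CREATE TABLE phone (
    phone_id     integer PRIMARY KEY,
    account_id   integer NOT NULL REFERENCES account_multi (account_id),
    phone_type   text    NOT NULL,
    phone_number text    NOT NULL,
    is_primary   boolean NOT NULL DEFAULT false  -- the single Boolean attribute
);

-- Two-table design: searching by number regardless of type is
-- one predicate that one index can serve.
CREATE INDEX phone_number_idx ON phone (phone_number);

SELECT account_id
FROM phone
WHERE phone_number = '555-0100';

-- Single-table design: the same search needs one predicate per phone column,
-- and each column would need its own index.
SELECT account_id
FROM account_single
WHERE home_phone = '555-0100'
   OR work_phone = '555-0100'
   OR cell_phone = '555-0100';
```

In the two-table design, a second cell phone or a virtual number is just another row, whereas the single-table design would need a new column for each additional phone.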

Application Development and Performance

We are talking about application development, not just the database side of development, because, once again, database queries are not executed by themselves; they are parts of applications. Traditionally, optimizing individual queries is viewed as optimization, but we are going to take a broader approach.

Quite often, although each database query executed by an application returns results in less than 0.1 seconds, an application page response time may amount to 10 seconds or more. Technically speaking, optimization of such processes is not a database optimization in its traditional meaning, but there is a lot a database developer can do to improve the situation. We cover a relevant optimization technique in Chapters 10 and 13.
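One common way this adds up, assumed here purely for illustration, is an application that issues the same small query once per row and pays a network round trip each time. The sketch below contrasts that pattern with a single set-based statement; the account_id value is hypothetical, and the tables and join columns are the ones used in the listings earlier in this chapter.

```sql
-- Pattern that adds up: the application runs a statement like this once per
-- booking, each call fast on its own but each costing a round trip.
--   SELECT * FROM passenger WHERE booking_id = $1;

-- Set-based alternative: one statement and one round trip
-- returns the passengers for all bookings of an account.
SELECT ps.*
FROM booking bk
    JOIN passenger ps ON ps.booking_id = bk.booking_id
WHERE bk.account_id = 12345;  -- hypothetical account_id value
```

Neither form is slow as an individual query; the difference shows up only when the whole page request is measured.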

Other Stages of the Lifecycle

The life of an application does not end after release in production, and optimization is a continuous process as well. Although our goal should be to optimize for the long term, it is hard to predict exactly how the system will evolve. It is good practice to keep a continual eye on system performance, watching not only execution times but also trends.

A query may be very performant, and one might not notice that its execution time has started to increase, because it is still within acceptable limits and no automated monitoring system raises an alert.

Query execution time may change because data volume increased or the data distribution changed or execution frequency increased. In addition, we expect new indexes and other improvements in each new PostgreSQL release, and some of them may be so significant that they prompt rewriting original queries.

Whatever the cause of the change is, no part of any system should be assumed to be optimized forever.

PostgreSQL Specifics

Although the principles described in the previous section apply to any relational database, PostgreSQL, like any other database, has some specifics that should be considered. If you have some previous experience in optimizing other databases, you might find a good portion of your knowledge does not apply. Do not consider this a PostgreSQL deficiency; just remember that PostgreSQL does lots of things differently.

Perhaps the most important feature you should be aware of is that PostgreSQL does not have optimizer hints. If you previously worked with a database like Oracle, which does have the option of hinting to the optimizer, you might feel helpless when you are presented with the challenge of optimizing a PostgreSQL query. However, here is some good news: PostgreSQL does not have hints by design. The PostgreSQL core team believes in investing in developing a query planner which is capable of choosing the best execution path without hints. As a result, the PostgreSQL optimization engine is one of the best among both commercial and open source systems. Many strong database internal developers have been drawn to Postgres because of the optimizer. In addition, Postgres has been chosen as the founding source code base for several commercial databases partly because of the optimizer. With PostgreSQL, it is even more important to write your SQL statements declaratively, allowing the optimizer to do its job.

Another PostgreSQL feature you should be aware of is the difference between the execution of parameterized queries and dynamic SQL. Chapter 12 of this book is dedicated to the use of dynamic SQL, an option which is often overlooked.
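For readers who have not met the two forms, the sketch below shows what each looks like; how their execution differs is the subject of Chapter 12 and is not demonstrated here. The statement name and the variable name are hypothetical, and the query reuses the frequent_flyer example from earlier in this chapter.

```sql
-- Parameterized query: the statement text is fixed and the value is
-- supplied as a parameter at execution time.
PREPARE ff_by_level (int) AS
    SELECT count(*) FROM frequent_flyer WHERE level = $1;

EXECUTE ff_by_level(4);

DEALLOCATE ff_by_level;

-- Dynamic SQL: the statement text itself is assembled at run time,
-- here inside an anonymous PL/pgSQL block.
DO $$
DECLARE
    n bigint;  -- hypothetical variable holding the result
BEGIN
    EXECUTE format('SELECT count(*) FROM frequent_flyer WHERE level = %L', 4)
    INTO n;
    RAISE NOTICE 'level 4 frequent flyers: %', n;
END
$$;
```

The %L placeholder makes format() quote the value as a SQL literal, which is the safe way to splice values into dynamically built statements.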

With PostgreSQL, it is especially important to be aware of new features and capabilities added with each release. In recent years, Postgres has had over 180 of them each year. Many of these features are around optimization. We are not planning to cover them all; moreover, between the writing of this chapter and its publication, there will indubitably be more. PostgreSQL has an incredibly rich set of types and indexes, and it is always worth consulting recent documentation to check whether a feature you wanted might have been implemented.

More PostgreSQL specifics will be addressed later in the book.

Summary

Writing a database query is different from writing application code using imperative languages. SQL is a declarative language, which means that we specify the desired outcome, but do not specify an execution path. Since two queries yielding the same result may be executed differently, utilizing different resources and taking a different amount of time, optimization and thinking like a database are core parts of SQL development.

Instead of optimizing queries that are already written, our goal is to write queries correctly from the start. Ideally, optimization begins at the time of gathering requirements and designing the database. Then, we can proceed with optimizing both individual queries and the way the database calls from the application are structured. But optimization does not end there; in order to keep the system performant, we need to monitor performance throughout the system lifecycle.

H. Dombrovskaya et al., PostgreSQL Query Optimization, https://doi.org/10.1007/978-1-4842-6885-8_2

2. Theory: Yes, We Need It!

In order to write performant queries, a database developer needs to understand how queries are processed by a database engine. And to do that, we need to know the basics of relational theory. If the word “theory” sounds too dry, we can call it “the secret life of a database query.” In this chapter, we will take a look at this secret life, explaining what happens to a database query between the moment you click Execute or press Enter and the moment you see the result set returned from the database.

As discussed in the last chapter, a SQL query specifies what results are needed or what must be changed in the database but does not specify how exactly the expected results should be achieved. It is the job of the database engine to convert the source SQL query into executable code and execute it. This chapter covers the operations used by the database engine as it interprets a SQL query and their theoretical underpinning.

Query Processing Overview

In order to produce query results, PostgreSQL performs the following steps:

1. Compile and transform a SQL statement into an expression consisting of high-level logical operations, known as a logical plan.

2. Optimize the logical plan and convert it into an execution plan.

3. Execute (interpret) the plan and return results.
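The steps above can be observed from the outside with the EXPLAIN command. In this minimal sketch, the query is an arbitrary example borrowed from Chapter 1; any SELECT against your own tables would do.

```sql
-- Compilation and optimization only: PostgreSQL builds the execution plan
-- and prints it, but does not run the query.
EXPLAIN
SELECT count(*) FROM frequent_flyer WHERE level = 4;

-- All three steps: the plan is also executed, and the output adds
-- actual row counts and timings next to the planner's estimates.
EXPLAIN ANALYZE
SELECT count(*) FROM frequent_flyer WHERE level = 4;
```

Because EXPLAIN ANALYZE really executes the statement, wrap data-modifying statements in a transaction and roll it back when you only want to inspect the plan.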

Compilation

Compiling a SQL query is similar to compiling code written in an imperative language. The source code is parsed, and an internal representation is generated. However, the compilation of SQL statements has two essential differences.

First, in an imperative language, the definitions of identifiers are usually included in the source code, while definitions of objects referenced in SQL queries are mostly stored in the database. Consequently, the meaning of a query depends on the database structure: different database servers can interpret the same query
