Real World Multicore Embedded Systems
Ebook, 965 pages (about 9 hours)

About this ebook

This Expert Guide gives you the techniques and technologies in embedded multicore to optimally design and implement your embedded system. Written by experts with a solutions focus, this encyclopedic reference is an indispensable aid for tackling the day-to-day problems of building and managing multicore embedded systems.

Following an embedded system design path from start to finish, our team of experts takes you from architecture through hardware implementation to software programming and debug.

With this book you will learn:

• What motivates multicore

• The architectural options and tradeoffs; when to use what

• How to deal with the unique hardware challenges that multicore presents

• How to manage the software infrastructure in a multicore environment

• How to write effective multicore programs

• How to port legacy code into a multicore system and partition legacy software

• How to optimize both the system and software

• The particular challenges of debugging multicore hardware and software

• Examples demonstrating timeless implementation details

• Proven and practical techniques reflecting the authors’ expertise built from years of experience and key advice on tackling critical issues
Language: English
Release date: Feb 27, 2013
ISBN: 9780123914613

Book preview

Real World Multicore Embedded Systems - Bryon Moyer

Chapter 1

Introduction and Roadmap

Bryon Moyer, Technology Writer and Editor, EE Journal

Chapter Outline

Multicore is here

Scope

Who should read this book?

Organization and roadmap

Concurrency

Architecture

High-level architecture

Memory architecture

Interconnect

Infrastructure

Operating systems

Virtualization

Multicore-related libraries

Application software

Languages and tools

Partitioning applications

Synchronization

Hardware assistance

Hardware accelerators

Synchronization hardware

System-level considerations

Bare-metal systems

Debug

A roadmap of this book

Multicore is here

Actually, multicore has been around for many years in the desktop and supercomputing arenas, but it has lagged in the mainstream embedded world. Now it is here for embedded as well.

Up until recently, multicore within embedded has been restricted primarily to two fields: mobile (assuming it qualifies as an embedded system, a categorization that not everyone agrees with) and networking. There have been multiple computing cores in phones for some time now. However, each processor typically owned a particular job – baseband processing, graphics processing, applications processing, etc. – and did that job independently, with little or no interaction with other processing cores. So multicore really wasn’t an issue then. That’s changed now that application processors in smartphones have multiple cores: it’s time to start treating them as full-on multicore systems.

Meanwhile, networking (or, more specifically, packet-processing) systems have used multicore for a long time, well before any tools were available to ease the multicore job. This has been a highly specialized niche, with rockstar programmers deeply imbued with the skills needed to extract the highest possible performance from their code. This has meant handcrafting for specific platforms and manual programming from scratch. This world is likely to retain its specialty designation because, even as multicore matures, the performance requirements of these systems demand manual care.

For the rest of embedded, multiple cores have become an unavoidable reality. And multicore has not been enthusiastically embraced for one simple reason: it’s hard. Or it feels hard. There’s been a huge energy barrier to cross to feel competent in the multicore world.

Some parts of multicore truly are hard, but as it reaches the mainstream, many of the issues that you might have had to resolve yourself have already been taken care of. There are now tools and libraries and even changes to language standards that make embedded multicore programming less of a walk in the wild.

And that’s where this book comes in. There have been people quietly working for years on solving and simplifying multicore issues for embedded systems. And let’s be clear: what works for desktops is not at all acceptable for embedded systems, with their limitations on size, resources and power. It has taken extra work to make some of the multicore infrastructure relevant to embedded systems.

Some of the people involved in those processes or simply with experience in using multicore have contributed from their vast knowledge to help you understand multicore for embedded. Most importantly, the intent is that, by taking in the various topics we’ve covered, you’ll cross over that energy barrier and be able to start doing the multicore work you need to do.

Scope

The term embedded system is broad and ill-defined. You probably know it when you see it, although community standards may vary. We won’t try to define what is included; it’s probably easier to say what isn’t included:

– desktop-style general computing (although desktop computers are sometimes harnessed for use in embedded applications)

– high-performance computing (HPC), the realm of supercomputers and massively parallel scientific and financial computing.

Many of the concepts discussed actually apply to both of those realms, but we will restrict examples and specifics to the embedded space, and there will be topics (like MPI, for example) that we won’t touch on.

Who should read this book?

This book is for anyone who will need to work on embedded multicore systems. That casts a wide net. It includes:

• Systems architects who are transitioning from single-core to multicore systems.

• Chip architects who have to implement sophisticated systems-on-chips (SoCs).

• Software programmers designing infrastructure and tools to support embedded multicore.

• Software programmers writing multicore applications.

• Software programmers taking sequential programs and rewriting them for multicore.

• Systems engineers trying to debug and optimize a multicore system.

This means that we deal with hardware, firmware/middleware, software, and tools. The one area we don’t deal with is actual hardware circuit design. We may talk about the benefits of hardware queuing, for example, but we won’t talk about how to design a hardware queue on a chip.

We have assumed that you are an accomplished engineer with respect to single-core embedded systems. So we’re not going to go too far into the realm of the basic (although some review is helpful for context). For example, we won’t describe how operating systems work in general – we assume you know that. We talk about those elements of operating systems that are specific to multicore.

Organization and roadmap

In order to give you the broad range of information that underpins multicore technology, we’ve divided the book into several sections. These have been ordered in a way that allows a linear reading from start to finish, but you can also dive into areas of particular interest directly if you have experience in the other areas. Each chapter is an independent essay, although we’ve tried to avoid too much duplication, so you will find some references from chapter to chapter.

Concurrency

We start with a review of the concept that underlies everything that matters in this book: concurrency. The reason everything changes with multicore is that our old assumption that one thing happens before another no longer holds true. More than one thing can happen at the same time, meaning we get more work done more quickly, but we also open up a number of significant challenges that we haven’t had to deal with before. Understanding concurrency in detail is important to making sense out of everything else that follows.

Architecture

The next section concedes the fact that hardware is expensive to design and that, therefore, hardware platforms will be created upon which software will be written. It’s nice to think that embedded systems are a perfect marriage between purpose-built hardware and software that have been co-designed, but, in reality, especially when designing an expensive SoC, the hardware must serve more than just one application.

High-level architecture

This chapter takes a broad view of multicore hardware architecture as it applies to embedded systems. Author Frank Schirrmeister focuses on the arrangement of processing cores, but he must necessarily include some discussion of memory and interconnect as well. The intent of this chapter is to help you understand either how to build a better architecture or how to select an architecture.

Memory architecture

Memory can be arranged in numerous different ways, each of which presents benefits and challenges. In this chapter, author Gitu Jain picks up on the more general description in the high-level architecture chapter and dives deeper into the implications not only of main memory, but also of cache memory and the various means by which multiple caches can be kept coherent.

Interconnect

Processors and memory are important, but the means by which they intercommunicate can have a dramatic impact on performance. It can also impact the cost of a chip to the extent that simply over-provisioning interconnect is not an option. Sanjay Deshpande provides a broad treatment of the considerations and trade-offs that apply when designing or choosing an interconnect scheme.

Infrastructure

Once the hardware is in place, a layer of services and abstraction is needed in order to shield applications from low-level details. This starts with something as basic and obvious as the operating system, but includes specialized libraries supporting things like synchronization.

Operating systems

Operating systems provide critical services and access to resources on computing platforms. But with embedded systems, the number of options for operating systems is far larger than it is for other domains. Some of those options can’t deal with multicore; others can. Some are big and feature-rich; others are small or practically non-existent. And they each have different ways of controlling how they operate. So in this chapter, with the assistance of Bill Lamie and John Carbone, I discuss those elements of operating systems that impact their performance in multicore systems, including the ability of associated tools to assist with software-level debugging.

Virtualization

The increased need for security and robustness has made it necessary to implement varying levels of virtualization in embedded systems. Author David Kleidermacher describes the many different ways in which virtualization can be implemented, along with the benefits and drawbacks of each option.

Multicore-related libraries

The details of multicore implementation can be tedious and error-prone to put into place. A layer of libraries and middleware can abstract application programs away from those details, making them not only more robust and easier to write, but also more portable between systems that might have very different low-level features. Author Max Domeika takes us on a tour of those multicore-related resources that are available to help ease some of the burden.

Application software

The goal of a good multicore system is to make the underlying configuration details as irrelevant as possible to the application programmer. That’s less possible for embedded systems, where programmers work harder to optimize their software to a specific platform. But multicore raises specific new issues that cannot be ignored by programmers.

Languages and tools

Some languages are better than others at handling the parallelism that multicore systems make possible. That said, some languages are more popular than others without regard to their ability to handle parallelism. Author Gitu Jain takes us through a variety of languages, covering their appropriateness for multicore as well as their prevalence.

Meanwhile, tailoring an application for a multicore platform suggests analysis and tools that don’t apply for single-core systems. Author Kenn Luecke surveys a range of tools from a number of sources that are applicable to multicore design.

Partitioning applications

Since multicore systems can do more than one thing at a time, application programs that were once sequential in nature can be teased apart to keep multiple cores busy. But that process of splitting up a program can be very difficult. I’m assisted by Paul Stravers in this chapter that describes the issues surrounding partitioning and then shows you how to do it both manually and with new specialized tools designed to help.

Synchronization

The success of various pieces of a program running in parallel to yield a correct result depends strongly on good synchronization between those different pieces. This concept lies at the heart of what can make or break a multicore application. Author Tom Dickens runs through a litany of dos and don’ts for keeping an application program on track as it executes.

Hardware assistance

While there may appear to be a bright line defining what is expected in hardware and what should be in software, there are some gray areas. In some cases, hardware can assist with specific functions to improve performance.

Hardware accelerators

There are times when a hardware block can increase the performance of some compute-intensive function dramatically over what software can manage. Those accelerators can be designed to operate in parallel with the cores, adding a new concurrency dimension. In this chapter, Yosinori Watanabe and I discuss hardware accelerators and their considerations for integration into embedded multicore systems.

Synchronization hardware

Much of the bookkeeping and other details required for correct functioning of a multicore system is handled by low-level software services and libraries such as those described in the earlier chapter on multicore libraries. But hardware infrastructure can help out here as well – to the point of the processor instruction set having an impact. So author Jim Holt takes a look at some important low-level hardware considerations for improving embedded system performance.

System-level considerations

We close out with system-level optimization concepts: bare-metal systems and debugging. These chapters complement the preceding material, leveraging all of the concepts presented in one way or another.

Bare-metal systems

The performance of some systems – notably, the packet-processing systems mentioned at the outset of this introduction – is so critical that the overhead of an operating system cannot be tolerated. This suggests a completely different way of doing multicore design. Sanjay Lal and I describe both hardware and software considerations in such a system.

Debug

Finally, debug can be much more difficult in a multicore system than in a single-core one. It’s no longer enough to stop and start a processor and look at registers and memory. You have multiple cores, multiple clocks, multiple sets of interrupts and handlers, and multiple memories and caches, each of which may be marching to the beat of a different drummer. Successful debug relies heavily on the existence of robust control and observability features, and yet these features entail costs that must be balanced. Author Neal Stollon discusses those trade-offs and trends for multicore debug.

A roadmap of this book

Figure 1.1 illustrates how the various chapters relate to each other and to system design. The particular depiction of the design process may feel oversimplified, and, in fact, it is. But, as with everything embedded, each project and process is different, so this should be viewed as a rough abstraction for the purposes of depicting chapter relevance.

Figure 1.1 A rough depiction of embedded multicore system development.

Based on this model:

• Everyone should read the Concurrency chapter.

• System architects and designers can jump into the various platform-design-related chapters, with Architecture, Memory, and Interconnect being the most fundamental. Hardware synchronization is important if designing accelerated infrastructure.

• For those engineers adapting an OS or virtualization platform or writing drivers or adapting libraries, there are chapters specifically introducing those topics.

• For application writers, Partitioning and Synchronization are important chapters. Hardware accelerators are also included here because our focus is not on designing the hardware for the accelerator, but rather on using it within the system: invoking it in software, writing drivers for it, and handling the concurrency that it can bring. There is also an overview of languages and tools as they pertain to multicore to help make choices at the start of a project.

• Integration and verification engineers can tackle the Debug chapter.

• System designers and programmers trying to extract maximal performance can take on the Bare-metal chapter.

That said, everyone can benefit from the topics they’re not specifically involved in. A hardware designer can design a better platform if he or she understands the real-world issues that software programmers face either when writing an application or building firmware. A programmer writing an application can write better code if he or she understands the nuances and trade-offs involved in the platform that will run the code.

Our goal is to bring you the experiences of those who have been tackling these issues since before it was mainstream to do so. Each of their chapters encapsulates years of learning what works and what doesn’t work so that you don’t have to repeat those lessons. We hope that this serves you well.

Welcome to the age of embedded multicore.

Chapter 2

The Promise and Challenges of Concurrency

Bryon Moyer, Technology Writer and Editor, EE Journal

Chapter Outline

Concurrency fundamentals

Two kinds of concurrency

Data parallelism

Functional parallelism

Dependencies

Producers and consumers of data

Loops and dependencies

Shared resources

Summary

The opportunities and challenges that arise from multicore technology – or any kind of multiple processor arrangement – are rooted in the concept of concurrency. You can loosely conceive of this as more than one thing happening at a time. But when things happen simultaneously, it’s very easy for chaos to ensue. If you create an assembly line to make burgers quickly in a fast food joint, with one guy putting the patty on the bun and the next guy adding a dab of mustard, things will get messy if the mustard guy doesn’t wait for a burger to be in place before applying the mustard. Coordination is key, and yet, as obvious as this may sound, it can be extremely challenging in a complex piece of software.

The purpose of this chapter is to address concurrency and its associated challenges at a high level. Specific solutions to the problems will be covered in later chapters.

Concurrency fundamentals

It is first important to separate the notion of inherent concurrency from that of implemented parallelization. A given algorithm or process may be full of opportunities for things to run independently from each other. An actual implementation will typically select from these opportunities a specific parallel implementation and go forward with that.

For example, in our burger-making example, you could make burgers more quickly if you had multiple assembly lines going at the same time. In theory, given an infinite supply of materials, you could make infinitely many burgers concurrently. However, in reality, you only have a limited number of employees and countertops on which to do the work. So you may actually implement, say, two lines even though the process inherently could allow more. In a similar fashion, the number of processors and other resources drives the decision on how much parallelism to implement.

It’s critical to note, however, that a chosen implementation relies on the inherent opportunities afforded by the algorithm itself. No amount of parallelization will help an algorithm that has little inherent concurrency, as we’ll explore later in this chapter.

So what you end up with is a series of program sections that can be run independently punctuated by places where they need to check in with each other to exchange data – an event referred to as synchronization.

For example, one fast food employee can lay a patty on a bun completely independently from someone else squirting mustard on a different burger. During the laying and squirting processes, the two can be completely independent. However, after they’re done, each has to pass his or her burger to the next guy, and neither can restart with a new burger until a new one is in place. So if the mustard guy is a lot faster than the patty-laying guy, he’ll have to wait idly until the new burger shows up. That is a synchronization point (as shown in Figure 2.1).

Figure 2.1 Where the two independent processes interact is a synchronization point.

A key characteristic here is the fact that the two independent processes may operate at completely different speeds, and that speed may not be predictable. Different employees on different shifts, for example, may go at different speeds. This is a fundamental issue for parallel execution of programs. While there are steps that can be taken to make the relative speeds more predictable, in the abstract, they need to be considered unpredictable. This concept of a program spawning a set of independent processes with occasional check-in points is shown in Figure 2.2.

Figure 2.2 A series of tasks run mutually asynchronously with occasional synchronization points.

Depending on the specific implementation, the independent portions of the program might be threads or processes (Figure 2.3). At this stage, we’re really not interested in those specifics, so to avoid getting caught up in that detail, they are often generically referred to as tasks. In this chapter, we will focus on tasks; how those tasks are realized, including the definitions of SMP and AMP shown in the figure, will be discussed in later chapters.

Figure 2.3 Tasks can be different threads within a process or different processes.

Two kinds of concurrency

There are fundamentally two different ways to do more than one thing at a time: bulk up so that you have multiple processors doing the same thing, or use division of labor, where different processors do different things at the same time.

Data parallelism

The first of those is the easiest to explain. Let’s say you’ve got a four-entry vector that you want to operate on. Let’s make it really simple for the sake of example and say that you need to increment the value of every entry in the vector. In a standard program, you would do this with a loop:
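As a minimal, illustrative C sketch (not the book’s own listing; the array name, values, and printout are invented), the sequential version might look like this:

    #include <stdio.h>

    int main(void) {
        int vector[4] = {0, 1, 2, 3};    /* four-entry vector; names and values are illustrative */

        /* sequential version: increment every entry, using i as the iterator */
        for (int i = 0; i < 4; i++) {
            vector[i] = vector[i] + 1;
        }

        for (int i = 0; i < 4; i++) {
            printf("%d ", vector[i]);
        }
        printf("\n");
        return 0;
    }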

This problem is exceedingly easy to parallelize. In fact, it belongs to a general category of problems whimsically called embarrassingly parallel (Figure 2.4). Each vector entry is completely independent and can be incremented completely independently. Given four processors, you could easily have each processor work on one of the entries and do the entire vector in ¼ the time it takes to do it on a single processor.

Figure 2.4 Embarrassingly parallel computation.

In fact, in this case, it would probably be even less than ¼ because you no longer have the need for an iterator – the i in the pseudocode above; you no longer have to increment i each time and compare it to 4 to see if you’re done (Figure 2.5).

Figure 2.5 Looping in a single core takes more cycles than multicore.

This is referred to as data parallelism; multiple instances of data can be operated on at the same time. The inherent concurrency allows a four-fold speed-up, although a given implementation might choose less if fewer than four processors are available.

Two key attributes of this problem make it so easy to parallelize:

– the operation being performed on one entry doesn’t depend on any other entry

– the number of entries is known and fixed.

That second one is important. If you’re trying to figure out how to exploit concurrency in a way that’s static – in other words, you know exactly how the problem will be parallelized at compile time – then the number of loop iterations must be known at compile time. A while loop or a for loop where the endpoint is calculated instead of constant cannot be so neatly parallelized because, for any given run, you don’t know how many parallel instances there might be.

Functional parallelism

The other way of splitting things up involves giving different processors different things to do. Let’s take a simple example where we have a number of text files and we want to cycle through them to count the number of characters in each one. We could do this with the following pseudo-program:
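As a minimal, illustrative C sketch (not the book’s own listing; the file names are invented), the sequential version might look like this:

    #include <stdio.h>

    int main(void) {
        const char *files[] = { "a.txt", "b.txt", "c.txt" };   /* illustrative file names */

        for (int f = 0; f < 3; f++) {
            FILE *fp = fopen(files[f], "rb");                  /* task 1: open the file    */
            if (fp == NULL) {
                continue;
            }

            long count = 0;                                    /* task 2: count characters */
            while (fgetc(fp) != EOF) {
                count++;
            }
            printf("%s: %ld characters\n", files[f], count);

            fclose(fp);                                        /* task 3: close the file   */
        }
        return 0;
    }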

We can take three processors and give each of them a different task. The first processor opens files; the second counts characters; and the third closes files (Figure 2.6).

Figure 2.6 Different cores performing different operations.

There is a fundamental difference between this and the prior example of data parallelism. In the vector-increment example, we took a problem that had been solved by a loop and completely eliminated the loop. In this new example, because of the serial nature of the three tasks, if you only had one loop iteration, then there would be no savings at all. It only works if you have a workload involving repeated iterations of this loop.

As illustrated in Figure 2.7, when the first file is opened, the second and third processors sit idle. After one file is open, then the second processor can count the characters, while the third processor is still idle. Only when the third file is opened do all processors finally kick in as the third processor closes the first file. This leads to the descriptive term pipeline for this kind of arrangement, and, when executing, it doesn’t really hit its stride until the pipeline fills up. This is also referred to as loop distribution because the duties of one loop are distributed into multiple loops, one on each processor.

Figure 2.7 The pipeline isn’t full until all cores are busy.

This figure also illustrates the fact that using this algorithm on only one file provides no benefit whatsoever.

Real-world programs and algorithms typically have both inherent data and functional concurrency. In some situations, you can use both. For example, if you had six processors, you could double the three-processor pipeline to work through the files twice as fast. In other situations, you may have to decide whether to exploit one or the other in your implementation.

One of the challenges of a pipeline lies in what’s called balancing the pipeline. Execution can only go as fast as the slowest stage. In Figure 2.7, opening files is shown as taking longer than counting the characters. In that situation, counting faster will not improve performance; it will simply increase the idle time between files.

The ideal situation is to balance the tasks so that every pipeline stage takes the same amount of time; in practice, this is so difficult as to be more or less impossible. It becomes even harder when different iterations take more or less time. For instance, it will presumably take longer to count the characters in a bigger file, so really the times for counting characters above should vary from file to file. Now it’s completely impossible to balance the pipeline perfectly.

Dependencies

One of the keys to the simple examples we’ve shown is the independence of operations. Things get more complicated when one calculation depends on the results of another. And there are a number of ways in which these dependencies crop up. We’ll describe some basic cases here, but a complete theory of dependencies can be quite intricate.

It bears noting here that this discussion is intended to motivate some of the key challenges in parallelizing software for multicore. In general, one should not be expected to manually analyze all of the dependencies in a program in order to parallelize it; tools become important for this. For this reason, the discussion won’t be exhaustive, and will show concept examples rather than focusing on practical ways of dealing with dependencies, which will be covered in the chapter on parallelizing software.

Producers and consumers of data

Dependencies are easier to understand if you think of a program as consisting of producers and consumers of data (Figure 2.8). Some part of the program does a calculation that some other part will use: the first part is the producer and the second is the consumer. This happens at very fine-grained instruction levels and at higher levels, especially if you are taking an object-oriented approach – objects are also producers and consumers of data.

Figure 2.8 Producers and consumers at the fine- and coarse-grained level. Entities are often both producers and consumers.

At its most basic, a dependency means that a consumer of data must wait to consume its data until the producer has produced the data (Figure 2.9). The concept is straightforward, but the implications vary depending on the language and approach taken. At the instruction level, many compilers have been designed to exploit low-level concurrency, doing things like instruction reordering to make execution more efficient while making sure that no dependencies are violated.

Figure 2.9 A consumer cannot proceed until it gets its data from the producer.

It gets more complicated with languages like C that allow pointers. The concept is the same, but compilers have no way of understanding how various pointers relate, and so can’t do any optimization. There are two reasons why this is so: pointer aliasing and pointer arithmetic.

Pointer aliasing is an extremely common occurrence in a C program. If you have a function that takes a pointer to, say, an image as a parameter, that function may name the pointer imagePtr. If a program needs to call that function on behalf of two different images – say, leftImage and rightImage – then when the function is called with leftImage as the parameter, leftImage and imagePtr will refer to the same data. When called for rightImage, rightImage and imagePtr will point to the same data (Figure 2.10).

Figure 2.10 Different pointers may point to the same locations at different times.
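A small, hypothetical C illustration of this aliasing pattern (the Image type and the function names below are invented for the example, not taken from the book):

    typedef struct {
        int width;
        int height;
        unsigned char *pixels;
    } Image;                              /* invented image type for illustration */

    /* inside this function, imagePtr aliases whichever image the caller passed in */
    void adjust_image(Image *imagePtr) {
        /* ... operate on the data through imagePtr ... */
        (void)imagePtr;
    }

    void process_pair(Image *leftImage, Image *rightImage) {
        adjust_image(leftImage);          /* here imagePtr and leftImage refer to the same data  */
        adjust_image(rightImage);         /* here imagePtr and rightImage refer to the same data */
    }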

This is referred to as aliasing because a given piece of data may be accessed by variables of different names at different times. There’s no way to know statically what the dependencies are, not only because the names look completely different, but also because they may change as the program progresses. Thorough dynamic analysis is required to understand the relationships between pointers.

Pointer arithmetic can also be an obvious problem because, even if you know where a pointer starts out, manipulating the actual address being pointed to can result in the pointer pointing pretty much anywhere (including address 0, something every C programmer has managed at least once in his or her life). Where it ends up pointing may or may not correlate to a memory location associated with some other pointer (Figure 2.11).

Figure 2.11 Pointer arithmetic can cause a pointer to refer to some location in memory that may or may not be pointed to by some other pointer.

For example, when scanning through an array with one pointer to make changes, it may be very hard to understand that some subsequent operation, where a different pointer scans through the same array (possibly using different pointer arithmetic), will read that data (Figure 2.12). If the second scan consumes data that the first scan was supposed to put into place, then parallelizing those as independent will cause the program to function incorrectly. In many cases, this dependency cannot be identified by static inspection; the only way to tell is to notice at run time that the pointers address the same space.

Figure 2.12 Two pointers operating on the same array create a dependency that isn’t evident by static inspection.
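A small, hypothetical fragment showing the situation (the array and the access patterns are invented for the example):

    int data[16];
    int sum = 0;

    /* first scan: one pointer walks the array and writes each element */
    for (int *p = data; p < data + 16; p++) {
        *p = (int)(p - data);             /* producer: writes a value into each location */
    }

    /* second scan: a different pointer, using different arithmetic, reads the same memory */
    for (int i = 0; i < 16; i++) {
        int *q = data + (15 - i);         /* consumer: depends on the data written above */
        sum += *q;
    }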

These dependencies are based on a consumer needing to wait until the producer has created the data: writing before reading. The opposite situation also exists: if a producer is about to rewrite a memory location, you want to be sure that all consumers of the old data are done before you overwrite the old data with new data (Figure 2.13). This is called an anti-dependency. Everything we’ve discussed about dependencies also holds for anti-dependencies except that this is about waiting to write until all the reads are done: reading before writing.

Figure 2.13 The second pointer must wait before overwriting data until the first pointer has completed its read, creating an anti-dependency.
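A tiny, hypothetical fragment of an anti-dependency (variable names invented): the read of the old value must complete before the location is overwritten.

    int shared = 1;                /* location that will be rewritten                */

    int consumed = shared * 2;     /* consumer reads the old value first             */
    shared = 42;                   /* producer overwrites it: a write-after-read     */
                                   /* that must not be reordered before the read     */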

This has been an overview of dependencies; they will be developed in more detail in the Partitioning chapter.

Loops and dependencies

Dependencies become more complex when loops are involved – and in programs being targeted for parallelization, loops are almost always involved. We saw above how an embarrassingly parallel loop can be parallelized so that each processor gets one iteration of the loop. Let’s look at an example that’s slightly different.
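As an illustrative C sketch (not the book’s own listing), the modified loop might look like this, with each iteration using the result of the previous one:

    int vector[4] = {0, 1, 2, 3};

    /* each iteration consumes the value produced by the previous iteration;       */
    /* the sketch starts at 1 because the first iteration is glossed over below    */
    for (int i = 1; i < 4; i++) {
        vector[i] = vector[i - 1] + 1;   /* depends on the result of iteration i-1 */
    }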

Note that in this and all examples like this, I’m ignoring what happens for the first iteration, since that detail isn’t critical for the discussion.

This creates a subtle change because each loop iteration produces a result that will be consumed in the next loop iteration. So the second loop iteration can’t start until the first iteration has produced its data. This means that the loop iterations can no longer run exactly in parallel: each of these parallel iterations is offset from its predecessor (Figure 2.14). While the total computation time is still less than required to execute the loop on a single processor, it’s not as fast as if there were no dependencies between the loop iterations. Such dependencies are referred to as loop-carry (or loop-carried) dependencies.

Figure 2.14 Even though iterations are parallelized, each must wait until its needed data is produced by the prior iteration, causing offsets that increase overall computation time above what would be required for independent iterations.

It gets even more complicated when you have nested loops iterating across multiple iterators. Let’s say you’re traversing a two-dimensional matrix using i to scan along a row and using j to scan down the rows (Figure 2.15).

Figure 2.15 4×4 array with i iterating along a row (inner loop) and j iterating down the rows (outer loop).

And let’s assume further that a given cell depends on the new value of the cell directly above it (Figure 2.16):
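An illustrative C sketch of such a loop nest (not the book’s own listing; a 4×4 integer array is assumed):

    int A[4][4] = {0};

    /* i scans along a row (inner loop); j scans down the rows (outer loop) */
    for (int j = 1; j < 4; j++) {
        for (int i = 0; i < 4; i++) {
            A[j][i] = A[j - 1][i] + 1;   /* depends on the new value in the cell directly above */
        }
    }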

Figure 2.16 Each cell gets a new value that depends on the new value in the cell in the prior row.

First of all, there are lots of ways to parallelize this code, depending on how many cores we have. If we were to go as far as possible, we would need 16 cores since there are 16 cells. Or, with four cores, we could assign one row to each core.

If we did the latter, then we couldn’t start the second row until the first cell of the first row was calculated (Figure 2.17).

Figure 2.17 If each row gets its own core, then each row must wait until the first cell in the prior row is done before starting.

If we completely parallelized it, then we could start all of the first-row entries at the same time, but the second-row entries would have to wait until their respective first-row entries were done (Figure 2.18).

Figure 2.18 An implementation that assigns each cell to its own core.

Note that using so many cores really doesn’t speed anything up: using only four cores would do just as well since only four cores would be executing at any given time (Figure 2.19). This implementation assigns one column to each core, instead of one row, as is done in Figure 2.17. As a result, the loop can be processed faster because no core has to wait for any other core. There is no way to parallelize this set of nested loops any further because of the dependencies.

Figure 2.19 Four cores can implement this loop in the same time as 16.

Such nested loops give rise to the concept of loop distance. Each iterator gets a loop distance. So in the above example, in particular as shown in Figure 2.16, where the arrows show the dependency, the loop distance for i is 0 since there is no dependency; the loop distance for j is 1, since the data consumed in one cell depends on the cell directly above it, which is the prior j iteration. As a vector, the loop distance for i and j is [0,1].

If we changed the code slightly to make the dependency on j−2 instead of j−1:
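Continuing the same illustrative sketch, the dependency would move two rows up:

    for (int j = 2; j < 4; j++) {
        for (int i = 0; i < 4; i++) {
            A[j][i] = A[j - 2][i] + 1;   /* now depends on the cell two rows above */
        }
    }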

then the loop distance for j is 2, as shown in Figure 2.20.

Figure 2.20 Example showing j loop distance of 2.

This means that the second row doesn’t have to wait for the first row, since it no longer depends on the first row. The third row, however, does have to wait for the first row (Figure 2.21). Thus we can parallelize further with more cores, if we wish, completing the task in half the time required for the prior example.

Figure 2.21 With loop distance of 2, two rows can be started in parallel.

While it may seem obscure, the loop distance is an important measure for synchronizing data. It’s not a matter of one core producing data and the other immediately consuming it; the consuming core may have to wait a number of iterations before consuming the data, depending on how things are parallelized. While it’s waiting, the producer continues with its iterations, writing more data. Such data can be, for example, written into some kind of first-in/first-out (FIFO) memory, and the loop distance determines how long that FIFO has to be. This will be discussed more fully in the Partitioning chapter.
