Activity Recognition: Fundamentals and Applications
Ebook · 161 pages · 2 hours

About this ebook

What Is Activity Recognition


Activity recognition is an approach that uses a number of observations of the activities of one or more agents, together with the conditions of their surrounding environment, to identify those agents' actions and goals. Because it can offer personalized support for a wide variety of applications, and because of its connections to many other fields of study, such as medicine, human-computer interaction, and sociology, this area of research has attracted the attention of several computer science communities since the 1980s.


How You Will Benefit


(I) Insights and validations about the following topics:

Chapter 1: Activity recognition
Chapter 2: Computer vision
Chapter 3: Artificial neural network
Chapter 4: Machine learning
Chapter 5: Gesture recognition
Chapter 6: Recurrent neural network
Chapter 7: Context awareness
Chapter 8: Transfer learning
Chapter 9: Convolutional neural network
Chapter 10: List of datasets for machine-learning research

(II) Answering the public's top questions about activity recognition.

(III) Real-world examples of the use of activity recognition in many fields.

(IV) 17 appendices explaining, briefly, 266 emerging technologies in each industry, to give a 360-degree understanding of activity recognition technologies.

Who This Book Is For

Professionals; undergraduate and graduate students; enthusiasts and hobbyists; and anyone who wants to go beyond basic knowledge about any kind of activity recognition.

Language: English
Release date: Jul 6, 2023

    Book preview

    Activity Recognition - Fouad Sabry

    Chapter 1: Activity recognition

    From a set of observations on the actions of the agents and the surrounding conditions, activity recognition attempts to deduce the agents' actions and goals. Many different areas of study, including medicine, human-computer interaction, and sociology, are connected to this area of study, which has attracted the attention of the computer science community since the 1980s.

    A wide variety of terms, including plan recognition, goal recognition, intent recognition, behavior recognition, location estimation, and location-based services, are used to describe activity recognition in various disciplines.

    In order to model a wide variety of human activities, sensor-based activity recognition combines the developing field of sensor networks with cutting-edge data mining and machine learning techniques.

    Activity recognition from sensor data is difficult because the input is noisy. This direction has been driven primarily by layered statistical modeling, in which recognition is performed and connected across a number of intermediate levels. At the lowest level, where the sensor data is collected, statistical learning is concerned with determining the precise locations of agents from the signal data. At the intermediate level, statistical inference may focus on recognizing activities from the inferred location sequences and environmental conditions. At the highest level, a combination of logical and statistical reasoning is used to infer an agent's overall goal or subgoals from the activity sequences.

    The first work to recognize activities for multiple users using on-body sensors was ORL's work with active badge systems. Later work investigates the fundamental problem of identifying user activities from home sensor data and proposes a novel pattern-mining approach that can identify both single-user and multi-user activities in a single framework.

    Recognizing group behavior as an entity, rather than the activities of individual members within the group, is a key distinction between group activity recognition and single, or multi-user activity recognition.

    Logic-based methods catalog all possible explanations for the observed behavior, so every plan or goal that is consistent with the observations must be considered. Kautz formulated a formal theory of plan recognition, describing it as a process of circumscribing logical inferences. All actions and plans are uniformly referred to as goals, and a recognizer's knowledge is represented by a set of first-order statements known as an event hierarchy. First-order logic encodes the hierarchy of events by specifying their abstraction, decomposition, and functional relationships.

    As new events occur, plans and goals that become inconsistent with the observations are discarded. Techniques have also been demonstrated for adapting a goal recognizer to unique user behaviors based on historical data. The direct argumentation model described by Pollack et al. can assess the validity of various arguments for describing beliefs and intentions.

    A significant limitation of logic-based approaches is their inability to represent uncertainty: any two plans that are both consistent with the observed actions are considered equally plausible, and there is no way to prefer one over the other. Logic-based approaches also have a limited capacity for learning.

    Stream reasoning is a method for logic-based activity recognition. It is based on answer set programming, which models ambiguity and uncertainty with weak constraints.

    In order to reason about actions, plans, and goals in the face of uncertainty, activity recognition has recently adopted probability theory and statistical learning models. Several methods exist in the literature that take uncertainty into account when making inferences about an agent's plans and goals.

    Hodges and Pollack developed machine learning-based systems to recognize people doing common tasks, such as making coffee, using data collected from sensors. Some of these works use radio-frequency identification (RFID) and Global Positioning System (GPS) data to infer how users are getting around.

    Temporal probabilistic models have been demonstrated to outperform non-temporal models for activity recognition. Discriminative models such as conditional random fields (CRFs) also perform well.

    There are benefits and drawbacks to both generative and discriminative models, and selecting the right one is context-specific. Datasets and implementations of several well-known models (HMM, CRF) for activity recognition are publicly available.

    Hidden Markov models (HMMs) and conditional random fields (CRF) models are two examples of common temporal probabilistic models that directly model the correlations between activities and gathered sensor data. Evidence for using hierarchical models, which account for the rich hierarchical structure present in human behavioral data, has been mounting in recent years.
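    To make the HMM approach concrete, the sketch below decodes a most likely activity sequence from sensor observations with the Viterbi algorithm. The two activities, the discretized accelerometer observations, and every probability are invented for illustration; a real system would learn these parameters from labeled sensor data.

```python
import math

# Hidden states are activities; observations are discretized sensor readings.
# All probabilities are illustrative placeholders, not learned values.
states = ["walking", "sitting"]

start_p = {"walking": 0.5, "sitting": 0.5}
trans_p = {
    "walking": {"walking": 0.8, "sitting": 0.2},
    "sitting": {"walking": 0.3, "sitting": 0.7},
}
emit_p = {
    "walking": {"low_accel": 0.2, "high_accel": 0.8},
    "sitting": {"low_accel": 0.9, "high_accel": 0.1},
}

def viterbi(obs):
    """Return the most likely activity sequence for an observation sequence."""
    # V[t][s] = (best log-prob of any state path ending in s at time t, backpointer)
    V = [{s: (math.log(start_p[s]) + math.log(emit_p[s][obs[0]]), None)
          for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prev, lp = max(
                ((p, V[t - 1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            V[t][s] = (lp + math.log(emit_p[s][obs[t]]), prev)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

print(viterbi(["low_accel", "high_accel", "high_accel"]))
```

    The same decoding problem is what off-the-shelf HMM libraries solve; the point here is only the temporal structure: each decoded activity depends on its neighbors through the transition probabilities, which is what non-temporal classifiers lack.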

    Recently, data mining-based machine learning methods have been proposed as an alternative to more conventional methods. Activity recognition is posed as a pattern-based classification problem in Gu et al. To recognize sequential, interleaved, and concurrent activities in a unified solution, they proposed a data mining approach based on discriminative patterns, which describe significant changes between any two activity classes of data.

    GPS data can also be used for location-based activity recognition.

    The problem of analyzing agent behavior from footage captured by multiple cameras is both crucial and difficult. The most prominent approach is based on computer vision. Human-computer interaction, user interface design, robot learning, and surveillance are just a few of the many domains in which vision-based activity recognition has found success. The ICCV and CVPR are two scientific conferences that regularly feature research on vision-based activity recognition.

    There has been a lot of effort put into visual activity recognition. Many different approaches have been tried by researchers, including optical flow, Kalman filtering, Hidden Markov models, etc., for use with single cameras, stereo cameras, and infrared cameras. Studies on this topic have covered a wide range of considerations, from following a single pedestrian to following a group of pedestrians to picking up dropped objects.

    Some researchers have been experimenting with RGBD cameras like the Microsoft Kinect to track human motion in recent years. Normal 2D cameras lack the third dimension (depth) that depth cameras provide. Real-time skeleton models of humans in various positions have been created using the data collected by these depth cameras. Researchers have used the skeleton data to create models of human activities, which are then trained to recognize novel behaviors.

    Rapid progress in RGB video-based activity recognition can be attributed to the recent emergence of deep learning. Such systems take videos from RGB cameras as input and perform tasks such as classifying activities, detecting when activities begin and end, and localizing them in time and space.

    Vision-based activity recognition has come a long way, but it is still far from practical deployment in most visual surveillance applications. The human brain, by contrast, appears remarkably adept at recognizing human actions. This skill depends on more than prior knowledge; it also requires recognizing which pieces of information are pertinent in a given scenario and applying sound reasoning. This observation has prompted proposals to incorporate commonsense reasoning and contextual knowledge into vision-based activity recognition systems.

    The computational process of vision-based activity recognition is typically broken down into four stages: human detection, human tracking, human activity recognition, and finally a high-level activity evaluation.

    Fine-grained action localization is commonly used in computer vision-based activity recognition to provide per-image segmentation masks separating the human object and its action category (e.g., Segment-Tube).

    A person's unique gait can be used as a reliable identifier. A person's gait or gait feature profile can be recorded using gait-recognition software and then used to identify them later, even if they are disguised.

    Activity recognition based on indoor and urban Wi-Fi signals and 802.11 access points is subject to considerable noise and uncertainty. A dynamic Bayesian network model can be used to account for these ambiguities.

    Wi-Fi activity recognition takes into account the signal's reflection, diffraction, and scattering as it passes through a human body. These signals provide data that can be used by scientists to study human bodily function.

    When wireless signals are transmitted indoors, obstacles such as walls, the ground, and the human body produce various effects, including reflection, scattering, and diffraction. Because surfaces reflect the signal during transmission, the receiving end experiences what is known as the multipath effect: the simultaneous reception of multiple copies of the signal from different paths.

    The direct signal and the reflected signal form the basis of the static model. Due to the lack of interference, Friis' transmission equation can be used to describe the propagation of direct signals:

    P_r = \frac{P_t G_t G_r \lambda^2}{(4\pi)^2 d^2}

    where P_t is the power fed into the transmitting antenna's input terminals; P_r is the power available at the receiving antenna's output terminals; d is the distance between the antennas; G_t is the transmitting antenna gain; G_r is the receiving antenna gain; and \lambda is the wavelength of the radio frequency.

    The revised equation, taking into account the reflected signal, is:

    P_r = \frac{P_t G_t G_r \lambda^2}{(4\pi)^2 (d + 4h)^2}

    where h is the distance between the reflection point and the direct path.

    When a human is present, an additional signal path is created. The final equation is therefore:

    P_r = \frac{P_t G_t G_r \lambda^2}{(4\pi)^2 (d + 4h + \Delta)^2}

    where \Delta is the approximate path-length difference caused by the human body.
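    As a rough numeric check of the three equations above, the sketch below evaluates the received power for the direct path, the reflected path, and the path affected by a human body. The transmit power, antenna gains, carrier frequency, and geometry are assumed values chosen for illustration, not taken from the text.

```python
import math

def received_power(p_t, g_t, g_r, wavelength, path_len):
    """Friis-style received power (W) over an effective path length (m)."""
    return p_t * g_t * g_r * wavelength ** 2 / ((4 * math.pi) ** 2 * path_len ** 2)

# Illustrative, assumed values: 2.4 GHz Wi-Fi, unit antenna gains.
c = 3e8
wavelength = c / 2.4e9            # ~0.125 m
p_t, g_t, g_r = 0.1, 1.0, 1.0     # 100 mW transmit power
d, h, delta = 10.0, 0.5, 0.2      # distance, reflection offset, body-path change (m)

direct = received_power(p_t, g_t, g_r, wavelength, d)
reflected = received_power(p_t, g_t, g_r, wavelength, d + 4 * h)
with_human = received_power(p_t, g_t, g_r, wavelength, d + 4 * h + delta)

# A longer effective path always means less received power.
assert direct > reflected > with_human
```

    The comparison makes the static model's intuition visible: each successive term (4h, then \Delta) lengthens the effective path and so attenuates the received signal, and it is this attenuation pattern that carries information about the human body.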

    The dynamic model additionally accounts for human motion, which results in a constantly shifting signal transmission path. This phenomenon can be described by the Doppler shift, which depends on the velocity of the moving object:

    \Delta f = \frac{2 v \cos\theta}{c} f

    where v is the speed of the moving object, \theta is the angle between the motion and the signal path, f is the carrier frequency, and c is the speed of light.

    Human activity can be further confirmed by calculating the Doppler Shift of the received signal in order to ascertain the movement pattern. In one study, the Doppler effect was used as a fingerprint to accurately identify nine distinct types of motion.
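    Plugging assumed numbers into the Doppler formula above shows the scale of the effect; the walking speed, angle, and carrier frequency are illustrative values, not measurements from the studies mentioned.

```python
import math

def doppler_shift(v, theta_deg, f):
    """Frequency shift (Hz) for a reflector moving at speed v (m/s)
    at angle theta to the signal path, with carrier frequency f (Hz)."""
    c = 3e8  # speed of light
    return 2 * v * math.cos(math.radians(theta_deg)) / c * f

# A person walking at 1.5 m/s directly along the path of a 5 GHz signal:
shift = doppler_shift(1.5, 0, 5e9)
print(round(shift, 1))  # ≈ 50 Hz
```

    A shift of a few tens of hertz on a multi-gigahertz carrier is tiny in relative terms, which is why movement patterns are extracted from the shift itself rather than from the absolute frequency.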

    The wireless signal transmission model was developed from earlier research into light interference and diffraction in the Fresnel zone. A Fresnel zone is an elliptical region with the sender and receiver at its focal points.

    The signal path formed by reflection off the human body shifts as the person moves through different Fresnel zones; this shift occurs at regular intervals as people cross zone boundaries. The Fresnel model has been used to improve accuracy in activity recognition tasks.

    Better performance can be attained in some tasks if the human body is accurately modeled. To detect breathing, the human body is modeled as a pair of concentric cylinders: the chest surface lies on the outer cylinder at inhalation and on the inner cylinder at exhalation, so the distance traveled by the rib cage during breathing equals the difference between the two radii. The resulting phase shift of the signal is described by:

    \theta = 2\pi \frac{2\,\Delta d}{\lambda}

    where \theta is the change in the signal phase, \lambda is the wavelength of the radio frequency, and \Delta d is the distance moved by the rib cage.

    Activity recognition and action recognition algorithms can be evaluated on a variety of popular datasets.
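    Plugging assumed numbers into the phase-shift formula above (a 5 mm rib-cage displacement and a 5 GHz carrier, both illustrative values) shows that breathing produces a clearly measurable phase change:

```python
import math

def breathing_phase_shift(delta_d, wavelength):
    """Phase change (radians) for a rib-cage displacement delta_d (m)
    at a given signal wavelength (m)."""
    return 2 * math.pi * (2 * delta_d) / wavelength

# Assumed values: 5 mm chest displacement, 5 GHz carrier (wavelength = 6 cm).
wavelength = 3e8 / 5e9
theta = breathing_phase_shift(0.005, wavelength)  # ≈ 1.05 rad (about 60 degrees)
```

    The factor of 2 in the numerator reflects the round trip: the reflected path lengthens by twice the displacement of the chest surface, so even millimeter-scale motion maps to a sizeable fraction of a wavelength at Wi-Fi frequencies.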

    UCF-101 contains over 13,000 clips and 27 hours of video spanning 101 human action classes, including applying makeup, playing the dhol, cricket shots, shaving, and more.

    HMDB51 is a compilation of real-world videos from a wide range of sources, including internet video and film.
