Biostatistics and Computer-based Analysis of Health Data using Stata
By Christophe Lalanne and Mounir Mesbah
About this ebook
This volume of the Biostatistics and Health Sciences Set focuses on statistics applied to clinical research. The use of Stata for data management and statistical modeling is illustrated using various examples. Many aspects of data processing and statistical analysis of cross-sectional and experimental medical data are covered, including regression models commonly found in medical statistics. This practical book is primarily intended for health researchers with basic knowledge of statistical methodology. Taking basic concepts as given, the authors focus on the practice of biostatistical methods essential to clinical research, epidemiology and analysis of biomedical data (including comparison of two groups, analysis of categorical data, ANOVA, linear and logistic regression, and survival analysis). Examples from clinical trials and epidemiological studies provide the basis for a series of practical exercises, which familiarize the reader with essential Stata packages and commands.
- Provides detailed examples of the use of Stata for common biostatistical tasks in medical research
- Features a work program structured around the four previous chapters and a series of practical exercises with commented corrections
- Includes an appendix to help the reader familiarize themselves with additional packages and commands
- Focuses on the practice of biostatistical methods that are essential to clinical research, epidemiology, and analysis of biomedical data
Christophe Lalanne
Christophe Lalanne is a Research Engineer at Paris-Diderot University, France. His research involves the modeling of data from clinical research.
Biostatistics and Health Science Set
coordinated by
Mounir Mesbah
Table of Contents
Cover image
Title page
Copyright
Introduction
1: Language Elements
Abstract
1.1 Data representation in Stata
1.2 Descriptive univariate statistics and estimation
1.3 Bivariate descriptive statistics
1.4 Key points
1.5 Further reading
1.6 Applications
2: Measures of Association, Comparisons of Means and Proportions for Two Samples or More
Abstract
2.1 Comparisons of two group means
2.2 Comparisons of two proportions
2.3 Risk measures and OR
2.4 Analysis of variance
2.5 Key points
2.6 Further reading
2.7 Applications
3: Linear Regression
Abstract
3.1 Measures of association between two numeric variables
3.2 Linear regression
3.3 Multiple linear regression
3.4 Key points
3.5 Further reading
3.6 Applications
4: Logistic Regression and Epidemiological Analyses
Abstract
4.1 Measures of association in epidemiology
4.2 Logistic regression
4.3 Key points
4.4 Further reading
4.5 Applications
5: Survival Data Analysis
Abstract
5.1 Data representation and descriptive statistics
5.2 Descriptive statistics
5.3 Survival function and Kaplan–Meier curve
5.4 Cox regression
5.5 Key points
5.6 Further reading
5.7 Applications
Bibliography
Index
Copyright
First published 2016 in Great Britain and the United States by ISTE Press Ltd and Elsevier Ltd
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Press Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk
Elsevier Ltd
The Boulevard, Langford Lane
Kidlington, Oxford, OX5 1GB
UK
www.elsevier.com
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
For information on all our publications visit our website at http://store.elsevier.com/
© ISTE Press Ltd 2016
The rights of Christophe Lalanne and Mounir Mesbah to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
Library of Congress Cataloging in Publication Data
A catalog record for this book is available from the Library of Congress
ISBN 978-1-78548-142-0
Printed and bound in the UK and US
Introduction
Much of what is done with statistical software amounts to manipulating, or even literally transforming, the digital data that represent statistical observations. It is therefore paramount to understand how statistical data are represented and how they can be handled by software such as Stata. Once the data have been imported, recoded and, where necessary, transformed, describing the variables of interest and summarizing their distributions in numerical and graphical form is a fundamental preparatory stage for any statistical modeling, hence the importance of these early stages in a statistical analysis project. In a second step, it is essential to master the commands that compute the main measures of association used in medical research, and to know how to fit the conventional explanatory and predictive models: analysis of variance, linear and logistic regression, and the Cox model. With a few exceptions, the Stata commands available in a standard installation of the software (base commands) will be preferred over specialized libraries of commands.
This book assumes that the reader is already familiar with basic statistical concepts, in particular the calculation of central tendency and dispersion indicators for a continuous variable, contingency tables, analysis of variance and conventional regression models. The objective here is to apply this knowledge to datasets described in numerous other works, keeping the interpretation of results to a minimum, so that the reader quickly becomes familiar with using Stata on real data. Particular emphasis is given to the management and manipulation of structured data, since this is commonly estimated to make up 60–80% of a statistician's work. There are many books on Stata, in French or in English, covering both the technical and the statistical points of view. Some of these works are predominantly general in scope [ACO 14, HAM 13, RAB 04], while others are much more specialized and address similar topics, such as [FRY 14, DUP 09, VIT 05]. The purpose of this book is to enable readers to become accustomed to Stata quickly, so that they can perform their own analyses and continue learning independently in the field of medical statistics.
This book is a sequel to Biostatistics and Computer-based Analysis of Health Data using R [LAL 16], published by the same authors in the same collection. Every topic relating to data organization and exploratory data analysis, in particular graphical methods, is discussed therein. The same datasets are used in this book to make it easier to transfer the knowledge acquired with R.
In Chapter 1, the base commands for data management with Stata are introduced. This primarily concerns the creation and manipulation of quantitative and qualitative variables (recoding of individual values, counting of missing observations), the import of data stored as text files, and elementary computations (minimum, maximum, arithmetic mean, difference, frequency, etc.). We will also examine how to save preprocessed data in text or Stata format. The objective is to understand how data are represented in Stata and how to work with them. The commands used to describe a data table composed of quantitative or qualitative variables are also presented. The descriptive approach is strictly univariate, which is a prerequisite for any statistical analysis. Base graphics commands (histograms, density curves, bar or dot plots) will be presented in addition to the usual numerical summaries of central tendency (mean, median) and dispersion (variance, quartiles). Point and interval estimation for means and proportions will also be addressed. The objective is to become familiar with simple Stata commands that operate on a single variable, optionally with options that control the calculation or that select statistical units among the available observations.
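As a language-neutral illustration of the summaries listed above, the sketch below computes them in plain Python rather than with Stata commands, which are not reproduced here. The function name `describe`, the use of `None` for missing observations and the birth-weight values are all hypothetical, and the 95% interval relies on a normal approximation rather than a Student quantile.

```python
import math
import statistics


def describe(values):
    """Summarize one numeric variable; None marks a missing observation."""
    observed = [v for v in values if v is not None]
    n = len(observed)
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)  # sample standard deviation (divisor n - 1)
    q1, _, q3 = statistics.quantiles(observed, n=4)
    # Assumption: normal-approximation 95% interval; a t quantile would be
    # wider for a sample this small.
    z = statistics.NormalDist().inv_cdf(0.975)
    half_width = z * sd / math.sqrt(n)
    return {
        "n": n,
        "missing": len(values) - n,
        "mean": mean,
        "median": statistics.median(observed),
        "variance": sd ** 2,
        "quartiles": (q1, q3),
        "ci95": (mean - half_width, mean + half_width),
    }


# Hypothetical birth weights in kg, with two missing values.
weights = [2.5, 3.1, None, 3.4, 2.9, 3.8, None, 3.0]
summary = describe(weights)
print(summary)
```

Counting missing values separately from the observed ones mirrors the distinction the chapter draws between the full set of observations and the statistical units actually used in a calculation.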
Chapter 2 is dedicated to the comparison of two samples for quantitative or qualitative measurements. The following hypothesis tests are addressed: Student's t-test for independent or paired samples, the non-parametric Wilcoxon test, the χ² test, Fisher's exact test and McNemar's test, together with the main measures of association for two variables (mean difference, odds ratio and relative risk). From this chapter onwards, there will be less emphasis on the univariate description of each variable, but it is advisable to always carry out the data description stages discussed in Chapter 1. The objective is to master the main statistical tests used when the relationship of interest is between a quantitative variable and a qualitative variable, or between two qualitative variables. This chapter also presents analysis of variance (ANOVA), in which the variability observed in a numerical response variable is explained by taking a group or classification factor into account, and the estimation of mean differences with confidence intervals. Emphasis will be placed on the construction of an ANOVA table summarizing the various sources of variability, and on the graphical methods that can be used to summarize the distribution of individual or aggregated data. The test for linear trend will also be studied for cases where the classification factor can be considered naturally ordered. The objective is to understand how to construct an explanatory model with one or two explanatory factors, and how to present the results of such a model numerically and graphically using Stata.
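The measures of association named above for two qualitative variables follow directly from the four cells of a 2×2 table. The sketch below, again in plain Python rather than Stata, makes the arithmetic explicit; the function name `two_by_two`, the cell labels and the counts are hypothetical and chosen only for illustration.

```python
def two_by_two(a, b, c, d):
    """Measures of association for a 2x2 table.

    a: exposed with the event      b: exposed without the event
    c: unexposed with the event    d: unexposed without the event
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
    }


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the event.
measures = two_by_two(20, 80, 10, 90)
print(measures)
```

With these invented counts the odds ratio (2.25) is further from 1 than the relative risk (2.0), which is the usual pattern when the event is not rare.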
Chapter 3 focuses on the analysis of the linear relation between two continuous quantitative variables. In the linear correlation approach, which assumes a symmetrical relation between the two variables, the