SAS Interview Questions You’ll Most Likely Be Asked: Job Interview Questions Series
Ebook, 481 pages, 5 hours


About this ebook

 

• 645 SAS Interview Questions
• 113 HR Interview Questions
• Real life scenario based questions
• Strategies to respond to interview questions
• 2 free online aptitude tests


SAS Interview Questions You'll Most Likely Be Asked is designed to include all the possible SAS interview questions that exist. This book includes 215 SAS Programming Guidelines, 215 Base SAS and 215 Advanced SAS interview questions along with detailed answers and proven strategies for getting hired as an IT professional. Apart from the technical questions, this value pack includes 113 Human Resource interview questions to give impressive answers that help nail the job interview. All this makes it a complete value-for-money purchase.

The following is included in this book:

  • 645 SAS Interview Questions, Answers and proven strategies for getting hired as an IT professional
  • Dozens of examples to respond to interview questions
  • 113 HR Questions with Answers and proven strategies to give specific, impressive, answers that help nail the interviews
  • 2 Aptitude Tests


About the Series
SAS Interview Questions You'll Most Likely Be Asked is part of the Job Interview Questions Series. Because technology changes so often, IT professionals need to keep up with the latest trends constantly and, more importantly, quickly. The Job Interview Questions Series is THE answer to this need.

We believe in delivering quality content and do so by tying up with the best authors around the globe. This series of books is written by expert authors and programmers who have been conducting interviews for a decade or more and have gathered vast experience in the world of information technology. Unlike comprehensive, textbook-sized reference guides, our books include only the information required for a job search. Hence, these books are short, concise and ready to use by working professionals.

Language: English
Release date: Jul 12, 2019
ISBN: 9781949395136


    SAS Interview Questions You’ll Most Likely Be Asked

    Job Interview Questions Series

    Vibrant Publishers

    Published by Vibrant Publishers, 2019.

    While every precaution has been taken in the preparation of this book, the publisher assumes no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

    SAS INTERVIEW QUESTIONS YOU’LL MOST LIKELY BE ASKED

    First edition. July 12, 2019.

    Copyright © 2019 Vibrant Publishers.

    ISBN: 978-1949395136

    Written by Vibrant Publishers.


    www.vibrantpublishers.com

    *****


    Copyright 2021, By Vibrant Publishers, USA. All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior permission of the publisher.

    This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. The author has made every effort in the preparation of this book to ensure the accuracy of the information. However, information in this book is sold without warranty either expressed or implied. The Author or the Publisher will not be liable for any damages caused or alleged to be caused either directly or indirectly by this book.

    Vibrant Publishers books are available at special quantity discount for sales promotions, or for use in corporate training programs. For more information please write to bulkorders@vibrantpublishers.com

    Please email feedback/corrections (technical, grammatical or spelling) to spellerrors@vibrantpublishers.com

    To access the complete catalogue of Vibrant Publishers, visit www.vibrantpublishers.com

    *****

    Dear Reader,

    Thank you for purchasing SAS Interview Questions You'll Most Likely Be Asked. We are committed to publishing books that are content-rich, concise and approachable enabling more readers to read and make the fullest use of them. We hope this book provides the most enriching learning experience as you prepare for your interview.

    Should you have any questions or suggestions, feel free to email us at reachus@vibrantpublishers.com

    Thanks again for your purchase. Good luck with your interview!

    – Vibrant Publishers Team

    *****

    SAS Interview Questions

    Review these typical interview questions and think about how you would answer them. Then read the answers listed; you will find the best possible answers along with strategies and suggestions.

    *****

    SAS Programming Guidelines

    *****

    Efficient SAS Programming

    1: How do you achieve scalability in SAS programming?

    Answer:

    SAS program scalability can be achieved in two ways: by scaling up and by scaling out. Scalability means ensuring the lowest time to solution, especially for the most vital tasks. Typically, when you want to speed up task completion, you either try to complete multiple processes at the same time or distribute the task across several processors and process in parallel. This sometimes involves overlapping certain processes. Scaling up requires better hardware that is capable of multiprocessing, which is known as symmetric multiprocessing or SMP. Scaling out requires more servers that can handle distributed processing.

    2: How do SQL Views help improve efficiency?

    Answer:

    A View typically consists of a subset of the entire table and hence is more efficient as it accesses a smaller set of data which is required. View also lets you hide the sensitive columns and complex queries from the user by choosing only what needs to be shown. Views always fetch fresh data from the table as they do not store any data.
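
    As a minimal sketch of the idea above, the following step defines a view over a subset of columns and rows. The view name work.teen_names is hypothetical, and the example assumes the sashelp.class sample table that ships with SAS.

```sas
/* Hypothetical view name; sashelp.class is a SAS sample table. */
proc sql;
    /* The view exposes only two columns and a subset of rows. */
    create view work.teen_names as
        select name, age
        from sashelp.class
        where age >= 13;
quit;

/* The view stores no data; its query runs each time the view is read. */
proc print data=work.teen_names;
run;
```

    Because only the query is stored, any later change to the underlying table shows up the next time the view is read.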

    3: What do you know about the SPD Engine?

    Answer:

    The SPD Engine, or SAS Scalable Performance Data Engine, was developed for SAS 9 to speed up the processing of large data sets by splitting them into smaller physical files called partitions. Several parallel processors have exclusive access to each partition and process them in parallel using threads. Partitions are created when the SAS data sets are created. When a WHERE clause is specified, it is split across the partitions and processed in parallel. Data blocks are also read in parallel. Multiple connections are created based on the partitions, which further reduces I/O bottlenecks. The SPD Engine also does an implicit sort if the query contains a BY clause.

    4: What resources are used to run a SAS program?

    Answer:

    The six resources used to run a SAS program are:

    a) Programmer time – The amount of time taken by the programmer for writing, testing and maintaining the program

    b) Real time – The time elapsed while executing a job

    c) CPU time – The amount of time the CPU takes to perform a task. The task can be reading data, writing data, calculations or implementation of a logic

    d) Memory – The work area memory space used for holding executable programs, data, etc

    e) Data storage space – The disk space for storing the data. This is measured in terms of bytes, kilobytes, gigabytes etc.

    f) I/O – The read and write operations performed to move data from the memory to any output device, and vice versa

    5: List the factors that need to be considered while assessing the technical environment.

    Answer:

    The four factors that need to be considered while assessing a technical environment are:

    a) Hardware – Available memory, number of CPUs, number of devices connected, network bandwidth, I/O bandwidth, and capability to upgrade

    b) Operating environment – The resource allocation & I/O methods

    c) System load – This includes the number of users sharing the system, the network traffic, and the predicted increase in load

    d) SAS environment – This includes all SAS software products installed, number of CPUs, and memory allocated for SAS programming

    6: Explain the functionality of the system option STIMER in the Windows environment.

    Answer:

    The STIMER option in the Windows environment specifies that CPU time and real time statistics are tracked and written to the SAS log throughout the SAS session.

    Example: The following line of code turns on the STIMER option.

    options stimer;

    7: What is the function of the option FULLSTIMER in the Windows operating environment?

    Answer:

    The FULLSTIMER option in the Windows environment specifies that all the available resource usage statistics are tracked and written to the SAS log throughout the SAS session.

    Example:

    options fullstimer;

    8: Explain the MEMRPT option.

    Answer:

    The MEMRPT option in the z/OS environment specifies that the memory usage statistics are tracked and written to the SAS log throughout the SAS session. This is not available as a separate option in the Windows operating environment.

    9: While benchmarking the programming techniques in SAS, why is it necessary to execute each programming technique in separate sessions?

    Answer:

    Each programming technique should be executed in a separate SAS session while benchmarking. The first time a program is read, the operating system might load the code into the cache and retrieve it from the cache when it is referenced again, which takes less time. The resource usage necessary to perform this action is referred to as overhead. Using separate sessions minimizes the effect of overhead on resource statistics.

    10: While doing benchmark tests, when is it advisable to run the code for each programming technique several times?

    Answer:

    It is advisable to run the code for each programming technique several times during benchmark tests if the system is executing other jobs at the same time. Running the code several times evens out variations in the resource consumption associated with the task, so the average resource usage can be determined.

    11: How do you turn off the FULLSTIMER option?

    Answer:

    The FULLSTIMER option can be turned off with the following line of code.

    options nofullstimer;

    12: What steps can be taken to reduce the programmer time?

    Answer:

    Programmer time is the amount of time required for the programmer to determine the specifications, write, submit, test and maintain the program. It is difficult to calculate the exact time, but it can be reduced by the use of well-documented programming practices and reuse of SAS code modules.

    *****

    Memory Usage

    13: What is PDV? How does it work?

    Answer:

    PDV or Program Data Vector is a memory area created after the input buffer is created. Two automatic variables, _N_ and _ERROR_, are created by SAS during compilation. These variables are used for processing but are never written to the data set. SAS builds each observation in the PDV, one at a time.
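
    One way to look at the PDV is to write its contents to the log on each iteration. The sketch below assumes the sashelp.class sample table that ships with SAS; any input data set would do.

```sas
/* Read three observations and dump the PDV after each one is loaded. */
data _null_;
    set sashelp.class(obs=3);
    /* PUT _ALL_ writes every variable in the PDV to the log,   */
    /* including the automatic variables _N_ and _ERROR_.       */
    put _all_;
run;
```

    The log shows _N_ increasing with each iteration of the DATA step, while neither automatic variable appears in any output data set.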

    14: How would you choose between DATA step and PROC SQL?

    Answer:

    With small data sets, PROC SQL works better since it loads the entire data set into the memory and works with the data. So there’s less need to go back and forth into the database. But with large data sets DATA step will work better as loading the entire data set with PROC SQL will block a huge chunk of memory. DATA step will always take one record at a time and hence, the number of records or large volume of data will not matter as long as the database connectivity remains good.

    15: Explain memory management in SAS.

    Answer:

    SAS, unlike Java and .NET, does not have garbage collection for memory management. Instead, it accomplishes the job with units of work called steps. Memory is allocated when a step begins and released when the step completes. This way, no memory is left loosely allocated during runtime. When dealing with large volumes of data, there may be cases when ample memory is not available. In such cases, SAS issues an error message stating that memory is not available, which is logged for reference. The hash objects in SAS let you handle a considerable number of objects quickly. The DATA step is also efficient in memory management as it takes only one record at a time. Since most SAS programs depend upon a Work Area which they use to store objects temporarily, this area typically runs out of space and needs to be handled efficiently.

    16: What is the sequence of actions performed in the background while trying to create a data set from another data set?

    Answer:

    While creating a data set from another data set the following actions take place in the background:

    a) The data gets copied from the input data set to a buffer in memory

    b) From the input buffer an observation at a time is written to PDV (Program Data Vector)

    c) Each observation from PDV is written to output buffer when processing is complete

    d) The contents of the output buffer are written to disk when the buffer is full.

    17: Define PAGE and PAGESIZE.

    Answer:

    A PAGE is the unit of data transfer between a storage device and memory. PAGESIZE is the amount of data that can be transferred to one buffer in a single I/O operation.

    18: What procedure is used to indicate the PAGESIZE of a data set?

    Answer:

    The CONTENTS procedure is used to know the PAGESIZE associated with a data set.

    Example: The following CONTENTS procedure reports the PAGESIZE associated with the data set exam.clinic1. It also gives the number of data set pages.

    proc contents data=exam.clinic1;

    run;

    19: Is it possible to control the PAGESIZE of an output data set?

    Answer:

    It is possible to control the PAGESIZE of an output data set by using BUFSIZE= option, which specifies the PAGESIZE in bytes.

    Example: The following program creates a data set exam.clinic1 from the data set exam.clinic2. In the following program the BUFSIZE= option specifies a PAGESIZE of 30720 bytes.

    options bufsize=30720;

    libname exam 'c:\myprog';

    data exam.clinic1;

    set exam.clinic2;

    run;

    20: What is the default value of the BUFSIZE= option?

    Answer:

    The default value of the BUFSIZE= option is 0. If BUFSIZE= option is set to zero SAS uses the optimal page size determined by SAS for that operating environment.

    21: Is it necessary to specify the BUFSIZE= option every time a data set is processed?

    Answer:

    No. The BUFSIZE= option is set at the time the data set is created, and that value becomes a permanent attribute of the data set. Once it is specified, it is used every time the data set is processed.

    22: What does the BUFNO= option signify?

    Answer:

    The BUFNO= option is used along with a SAS data set to lay down how many buffers are available for reading, writing, or updating. The larger the value of BUFNO= the faster the input/output function would be since more values will be stored in the buffer which avoids an actual input/output function. You can specify a larger number of pages to include in the BUFNO= and accordingly that many pages will be loaded into the memory.

    Example: The following program creates a data set MyExam.MyClinic from the data set MyExam.MyClinic2. In the following program, the BUFNO= option is given the value 6, which denotes 6 buffers.

    options bufno=6;

    libname MyExam 'D:\MyProgram';

    data MyExam.MyClinic;

    set MyExam.MyClinic2;

    run;

    23: How do you set the BUFNO= option to the maximum possible number?

    Answer:

    To set the BUFNO= option to the maximum value, you can specify BUFNO=MAX, which sets the maximum number of buffers available in the current operating environment. The largest possible value of MAX would be approximately 2 billion (2^31 - 1).

    Example: The following program creates a data set MyExam.MyClinic from the data set MyExam.MyClinic2. In the following program, the BUFNO= option is given the value MAX, that denotes the maximum buffer available in the current environment.

    options bufno=max;

    libname MyExam 'D:\MyProgram';

    data MyExam.MyClinic;

    set MyExam.MyClinic2;

    run;

    24: Is it necessary to specify the BUFNO= option every time a data set is processed?

    Answer:

    It is mandatory to specify the BUFNO= option every time a data set is processed. This is required since the buffer varies every time a data set is opened and closed. Moreover, the BUFNO= value set is valid only while a data set is open in the current session.

    25: What are the general guidelines for specifying the buffer size and buffer number in the case of small data sets?

    Answer:

    The main objective behind specifying the buffer size and buffer number is to reduce the number of I/O operations. In the case of small data sets, care must always be taken to allocate as many buffers as there are pages in the data set. This ensures that the entire data set can be loaded into the memory using a single I/O operation.

    26: How does the BUFSIZE= and BUFNO= impact the following program?

    data exam.clinic1 (bufsize=12288 bufno=10);

    set exam.clinic2;

    run;

    Answer:

    The above program reads the data set exam.clinic2 and creates exam.clinic1. The BUFSIZE= option specifies that exam.clinic1 is created with a buffer size of 12288 bytes. The BUFNO= option specifies that 10 pages of data are loaded into memory with each I/O transfer.

    27: Explain the SASFILE statement.

    Answer:

    The SASFILE statement opens a SAS data file, allocates enough buffers to hold it, and loads it into memory so that it remains available to the rest of the program. The buffers are not freed and reallocated for each step that uses the file. Instead, the file is kept in memory until it is closed.

    The following example explains the use of SASFILE in a simple way. The SASFILE statement opens the data set MyExam.MyClinic and allocates the buffer. It reads the file and loads it into the memory so that it is available to both the PROC PRINT as well as the PROC MEANS step. Finally, the SASFILE data file is closed with the CLOSE statement and the buffer is cleared.

    sasfile MyExam.MyClinic load;

    proc print data= MyExam.MyClinic;

    var Serial No result;

    run;

    proc means data= MyExam.MyClinic;

    run;

    sasfile MyExam.MyClinic close;

    28: What happens if the size of file in the memory increases during the execution of SASFILE statement?

    Answer:

    When the SASFILE statement is executed, SAS assigns some buffer to the data file based on the number of pages to be loaded and the size of the index file. Once this is done, the file data is loaded into the memory for updates. The buffer size is automatically increased as the file size to be saved increases. The initial buffer memory size allocated is only the minimum memory allocated to load the file. It automatically increases provided there is ample memory left in the current operating system.

    29: Mention the guidelines to be followed while using SASFILE statement.

    Answer:

    While using the SASFILE statement, the following guidelines are to be followed:

    a) There should be sufficient real memory to load the file.

    b) If there is not enough memory to load the entire file, a DATA step should be used to create a subset of the file that fits into the available memory. Since one part of the file is already loaded into the memory, the rest of the file data can also be easily accessed by the program. This reduces the CPU time significantly.

    30: When is the buffer allocated by the SASFILE statement freed?

    Answer:

    The buffer allocated by the SASFILE statement to load the data file is freed in two instances:

    a) When the SASFILE CLOSE statement is executed, the file is closed, and the buffer allocated for the data file is freed.

    Example: In the following program the SASFILE statement opens the data set MyExam.MyClinic and allocates the buffer. It reads the data into the memory which is available through the PROC PRINT and PROC MEANS steps. The last SASFILE statement closes the SAS data file and frees the buffer allocated for the file.

    sasfile MyExam.MyClinic load;

    proc print data= MyExam.MyClinic;

    var Serial No result;

    run;

    proc means data= MyExam.MyClinic;

    run;

    sasfile MyExam.MyClinic close;

    b) The SASFILE buffer is allocated only as long as the session is open. When SAS session ends, it frees the buffer and closes the data file.

    31: Which operations are not allowed in a file opened with SASFILE statement?

    Answer:

    There are certain operations that cannot be performed on a file opened with SASFILE statement, such as replacing the file and renaming the variables.

    32: How do you calculate the total number of bytes occupied by a data file if you know the page size?

    Answer:

    The total number of bytes that a data file occupies can be calculated by multiplying the page size by the number of pages.

    Example: If the data file exam.clinic1 has a page size of 8192 bytes and the number of pages is 900, then the data file occupies 7372800 bytes (8192 * 900).
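
    The same arithmetic can be checked in a short DATA step. This is only an illustration; the variable names are hypothetical, and the page size of 8192 bytes and the page count of 900 are the figures from the example above.

```sas
/* Multiply page size by page count to get the file size in bytes. */
data _null_;
    pagesize    = 8192;   /* bytes per page, as reported by PROC CONTENTS */
    pages       = 900;    /* number of data set pages                     */
    total_bytes = pagesize * pages;
    put total_bytes=;     /* writes total_bytes=7372800 to the log        */
run;
```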

    *****

    Data Storage Space

    33: What compresses the data storage space required to store a data set?

    Answer:

    SAS programs often create many temporary data sets which hold information during runtime. You can choose to hold the data permanently in one or more data sets depending upon the available space and program requirements. Ideally, you can save space by reducing the number and size of data sets and by cleaning up everything unnecessary from the storage space. SAS uses compression algorithms to reduce the size of data sets. The COMPRESS=YES or COMPRESS=BINARY option is used to compress a data set. COMPRESS=YES is used with data sets that primarily contain character data. COMPRESS=BINARY is used with data sets that primarily contain numeric data. REUSE=YES is used when you want SAS to reuse the space freed by deleted observations in a compressed data set.
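
    The options described above might be applied as data set options, as in the following sketch. The output data set names are hypothetical, and sashelp.class (a SAS sample table) is used only so that the steps are self-contained.

```sas
/* Character-oriented (RLE) compression, with reuse of freed space. */
data work.class_char (compress=yes reuse=yes);
    set sashelp.class;
run;

/* Binary (RDC) compression, intended for mostly numeric data. */
data work.class_bin (compress=binary);
    set sashelp.class;
run;
```

    For a table this small, compression is unlikely to save space, and SAS may say so in a log note. The benefit shows up with large files that contain long or repetitive values.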

    34: How does the WHERE statement help in reducing data storage space?

    Answer:

    The WHERE statement lets you remove all unnecessary observations or records being fetched into the data set. When using the WHERE statement, only those records that satisfy the WHERE condition will be fetched by the data set. So, it helps to filter the data being fetched thereby reducing the data storage space.
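
    A minimal sketch of this filtering, assuming the sashelp.class sample table and a hypothetical output data set work.teens:

```sas
/* Only observations that satisfy the WHERE condition are read, */
/* so the output data set holds fewer rows than the input.      */
data work.teens;
    set sashelp.class;
    where age >= 13;
run;
```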

    35: How do you clean up the storage space?

    Answer:

    You can clean up the storage space by using the DATASETS or DELETE procedures. While using the PROC DATASETS method, you have to mention the library and then the data set to delete. When using the PROC DELETE method, you have to mention the exact data to be deleted. The more popular method is to use the PROC DATASETS option. This makes sure that the temporary file created to hold the data is deleted as soon as it is not required.
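
    Both approaches can be sketched as follows. The data set names temp1 and temp2 are hypothetical temporary files created here only so that there is something to delete.

```sas
/* Create two throwaway data sets in the WORK library. */
data work.temp1 work.temp2;
    set sashelp.class;
run;

/* PROC DATASETS: name the library, then the member to delete. */
proc datasets library=work nolist;
    delete temp1;
quit;

/* PROC DELETE: name the exact data set to delete. */
proc delete data=work.temp2;
run;
```

    The NOLIST option only suppresses the directory listing in the log; it does not change what is deleted.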

    36: Explain COMPRESS= System option.

    Answer:

    The COMPRESS= system option is used to compress all data files created during a particular session. It is specified as COMPRESS=NO/YES/BINARY/CHAR. By default, COMPRESS is set to NO, which means no compression. When you set it to YES or CHAR, the RLE algorithm is used and the trailing blanks and zeros are trimmed off. It basically compresses the character data. The BINARY value runs Ross Data Compression (RDC), which uses a combination of RLE and sliding-window compression wherein a dictionary of frequently used words or character patterns is stored. The dictionary assigns a number to each phrase and replaces the phrase with that number on each occurrence. A map of these numbers and phrases is maintained separately. Thus, the main data set is compressed.

    37: I have a compressed data set. I want to add an observation to it. Will it allow me to add the new observation? If yes, where will it be added?

    Answer:

    Yes, you can add new observations to an already compressed data set. The new observation will be added to the end of the existing list. This is because the descriptor of the data set will rest after the last observation in the data set. If any observation is deleted, its space is not reused or tracked. Instead, the new observations are added at the end of the current data set.

    38: Explain POINTOBS= data set option.

    Answer:

    Typically, a data set is traversed sequentially. But the POINTOBS= option provides direct access to a particular observation using the observation number. You can set whether direct access is allowed or not by using the POINTOBS=YES/NO data set option. By default, it is YES and hence, observations can be accessed directly. This option is available only if you have compressed the data set using the COMPRESS=YES/CHAR/BINARY options.
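
    The sketch below shows how direct access to a compressed data set might look. The data set name work.class_cmp and the variable obsnum are hypothetical, and sashelp.class is a SAS sample table used for convenience.

```sas
/* Create a compressed data set that allows access by observation number. */
data work.class_cmp (compress=yes pointobs=yes);
    set sashelp.class;
run;

/* Read observation 5 directly with the POINT= option of the SET statement. */
data _null_;
    obsnum = 5;
    set work.class_cmp point=obsnum;
    put name= age=;
    stop;   /* required: POINT= reads never reach end of file on their own */
run;
```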

    39: Explain LENGTH, ATTRIB, KEEP and DROP statements.

    Answer:

    The LENGTH, ATTRIB, KEEP and DROP statements are used to reduce the space needed to store variables. Both LENGTH and ATTRIB statements can be used to limit the size of a variable. ATTRIB can also be used to format the variable. If the length of the variable is not adequate, the data may be truncated. KEEP specifies the variables that are to be written to the output data set. DROP specifies the variables that are to be excluded from the output data set because they are not needed again.
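
    A short sketch of these statements working together follows. The output data set work.class_small, the new variable initial, and the label text are hypothetical; the input is the sashelp.class sample table.

```sas
data work.class_small;
    /* LENGTH: reserve a single byte for a new character variable. */
    length initial $ 1;
    set sashelp.class;
    initial = substr(name, 1, 1);
    /* ATTRIB: attach a format and a label to an existing variable. */
    attrib height format=5.1 label='Height';
    /* DROP: leave these variables out of the output data set.      */
    /* A KEEP statement listing the other variables would have the  */
    /* same effect here.                                            */
    drop weight sex;
run;
```

    KEEP and DROP affect only what is written to the output data set; the dropped variables are still available for calculations within the step.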

    40: What factors are considered by SAS when calculating the data storage space required for a SAS data file?

    Answer:

    The following factors are considered by SAS when calculating the data storage space required for a SAS data file:

    a) Storage space required by the descriptor portion

    b) Storage space required by the observations

    c) Any storage overhead

    d) Storage space required for associated indexes

    41: How does a SAS character variable store data and what is the default length of a character variable?

    Answer:

    SAS character variables store data as one character per byte. The default length of a character variable is 8 bytes.

    42: Which step can be taken to reduce the length of a character variable?

    Answer:

    A LENGTH statement can be used to control the length of character variable.

    Example: In the following program the data set exam.clinic1 is created from the data set exam.clinic2. The variable name is assigned a length of 5. So, the variable name in the data set exam.clinic1 will have a length of 5.

    data exam.clinic1;

    length name $ 5;

    set exam.clinic2;

    run;

    43: How does SAS store numeric values and what is the default length of a numeric variable?

    Answer:

    SAS stores numeric values using double precision floating point representation (form of scientific notation). This helps with storing numbers of large magnitude and to perform computations that require precision after the decimal point. The default length of numeric variables is 8 bytes.

    44: Explain the significance of PROC COMPARE.

    Answer:

    PROC COMPARE is used to compare the contents of two SAS data sets. It compares the following:

    a) Data set attributes

    b) Variables

    c) Observations

    d) Variable attributes and values of matching variables

    Example: The following PROC COMPARE step compares the two data sets exam.result1 and exam.result2 and reports the results in the procedure output:

    proc compare base= exam.result1

    compare= exam.result2;

    run;

    45: What conditions make a data file an ideal candidate for compression?

    Answer:

    A data file becomes an ideal candidate for compression if it satisfies one or more of the following conditions:

    a) It is large

    b) It has many missing values

    c) It has many lengthy character values

    d) It has repeated characters or binary zeroes

    e) It has repeated values in the variable which are physically stored next to one another

    46: Explain the compression of a data set.

    Answer:

    A SAS data file by default is uncompressed. It can be compressed to conserve disk space. A data set can be compressed by using the COMPRESS= option.

    Example: The following program creates a compressed data set exam.result1 from the data set exam.result2. When the data set is created SAS writes a note to the log indicating the percentage of reduction in size obtained by compressing the data set. Here it uses the RLE (Run Length Encoding) algorithm for compressing the data set. RLE algorithm compresses the observations by reducing the repeated consecutive characters to 2-byte or 3-byte representations.

    data exam.result1 (compress= yes);

    set exam.result2;

    run;

    47: Which option is used for accessing an observation directly in an uncompressed data set?

    Answer:

    The POINT= option can be used for accessing an observation directly in an
