SAS Certified Big Data Professional

The SAS Certified Data Scientist program can deepen your knowledge, jump-start your career and boost your earning power. The data science certification program includes five certification exams. To earn the SAS Certified Data Scientist credential, you must pass all five exams:
  • SAS Big Data Preparation, Statistics and Visual Exploration
  • SAS Big Data Programming and Loading
  • SAS Certified Specialist: Machine Learning Using SAS Viya 3.4
  • Natural Language & Computer Vision Specialist
  • SAS Certified Specialist: Forecasting and Optimization Using SAS Viya 3.4

Big Data Challenges and Analysis-Driven Data

Topics Covered
  • Reading external data files
  • Storing and processing data
  • Combining Hadoop and SAS
  • Recognizing and overcoming big data challenges

Exploring Data with SAS Visual Analytics

Topics Covered
  • Finding previously unknown relationships and spotting trends in your data
  • Visualizing data using charts, plots and tables
  • Using the auto charting function to visualize data in the best possible way
  • Using advanced graphs, such as network diagrams, Sankey diagrams and word clouds
  • Adding analytics to your graphs and including descriptions of the analytics results
  • Navigating through your data using on-the-fly hierarchies

Preparing Data for Analysis and Reporting

Topics Covered
  • Creating and reviewing data explorations and data profiles
  • Creating data jobs for data improvement
  • Establishing monitoring aspects for the data
  • Understanding the Quality Knowledge Base (QKB) components
  • Using the component editors
  • Understanding various definition types

Introduction to ANOVA, Regression and Logistic Regression

Topics Covered
  • Generating descriptive statistics and exploring data with graphs
  • Performing analysis of variance and applying multiple comparison techniques
  • Performing linear regression and assessing the assumptions
  • Using regression model selection techniques to aid in the choice of predictor variables in multiple regression
  • Using diagnostic statistics to assess statistical assumptions and identify potential outliers in multiple regression
  • Using chi-square statistics to detect associations among categorical variables
  • Fitting a multiple logistic regression model
  • Scoring new data using the developed model

Introduction to SAS and Hadoop Essentials

Topics Covered
  • Accessing Hadoop distributions using the LIBNAME statement and the SQL pass-through facility
  • Using options and efficiency techniques for optimizing data access performance
  • Joining data using the SQL procedure and the DATA step
  • Reading and writing Hadoop files with the FILENAME statement
  • Executing and using Hadoop commands with PROC HADOOP
  • Using Base SAS procedures with Hadoop

DS2 Programming Essentials with Hadoop

Topics Covered
  • Identifying the similarities and differences between the SAS DATA step and the DS2 DATA step.
  • Converting a Base SAS DATA step to DS2.
  • Creating DS2 variable declarations, expressions and methods for data conversion, manipulation and conditional processing.
  • Creating user-defined and predefined packages to store, share and execute DS2 methods.
  • Creating and executing DS2 threads for parallel processing.
  • Leveraging the SAS In-Database Code Accelerator to execute DS2 code outside of a SAS session.
  • Executing DS2 code in the SAS High-Performance Analytics grid using the HPDS2 procedure.

Big Data Analysis with Hive and Pig

Topics Covered
  • Using Hive to design a data warehouse in Hadoop
  • Performing data analysis using HiveQL
  • Organizing data in Hadoop by usage
  • Performing analysis on unstructured data using Pig
  • Joining massive data sets using Pig
  • Using user-defined functions (UDFs)
  • Analyzing big data in Hadoop using Hive and Pig

Getting Started with SAS In-Memory Statistics

Topics Covered
  • Processing in-memory tables with PROC LASR and PROC IMSTAT
  • Accessing data more efficiently via intelligent partitioning
  • Creating filters and joins on in-memory data
  • Exporting ODS result tables for client-side graphic development
  • Producing descriptive statistics including counts, percentiles and means
  • Creating multidimensional summaries including cross-tabulations and contingency tables
  • Deriving kernel density estimates using normal kernel functions