If you’re a data professional, you know that it’s important to set aside some time for training when a new release or paradigm comes from your platform. In the case of SQL Server 2019 (and later), you’ll want to pay close attention to the Big Data Clusters feature. It’s an exponential knowledge increase, and that’s no exaggeration.
There’s a lot to learn to implement SQL Server’s Big Data Cluster system. I’ll be covering these topics in more depth at various workshops, events, courses, webinars and presentations around the world, and I thought I might show a few of the things the data professional needs to understand to get ready.
Some of these technologies and concepts are not owned or created by Microsoft. The concepts are universal, and a few of the technologies are open-source. I’ve marked those in italics.
I’ve also included a few links to training resources I’ve found useful. I normally use LinkedIn Learning for larger courses, along with edX, DataCamp, and many other platforms for in-depth training. The links I have indicated here are by no means exhaustive, but they are free, and they provide a good starting point.
Look for the training announcements I’ll post here on this blog to find out where our team is presenting these topics, and feel free to post comments on resources you have found useful.
| Technology | Description |
| --- | --- |
| Linux | Operating system used in Containers and Container management (Kubernetes) |
| Git | Source control management system |
| Containers | Encapsulation level for the SQL Server Big Data Cluster architecture |
| Kubernetes | Management, control plane and security for Containers |
| Microsoft Azure | Cloud environment for services |
| Azure Kubernetes Service (AKS) | Kubernetes as a Service |
| Apache HDFS | Scale-out storage subsystem |
| Apache Spark | In-memory, large-scale, scale-out data processing architecture used by SQL Server |
| Python, R, Java, SparkML | ML/AI programming languages used for Machine Learning and AI model creation |
| Azure Data Studio | Tooling for SQL Server, HDFS, Kubernetes cluster management, T-SQL, R, Python, and SparkML languages |
| SQL Server Machine Learning Services | R, Python and Java extensions for SQL Server |
| Microsoft Team Data Science Process (TDSP) | Project, Development, Control and Management framework |
| Monitoring and Management | Dashboards, logs, APIs and other constructs to manage and monitor the solution |
| Security | RBAC, Keys, Secrets, VNETs and Compliance for solutions |
If that looks like a lot, it’s because it is a lot. Stay tuned; I’m with you on the journey. We’ll learn together.