
Section 4: Design & Implement Data Storage

64: Building a Fact Table: We will use our existing SQL Server database, which contains the AdventureWorks sample data. Using SQL Server Management Studio, we connect to both the Azure SQL Database and our dedicated SQL pool, which is the data warehouse. We create a view based on two tables, "SalesOrderDetail" and "SalesOrderHeader", create a table out of that view, and then copy it to the data warehouse on Azure Synapse.
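As a runnable sketch of this view-then-table flow, here is a minimal local stand-in using Python's sqlite3. The table names follow AdventureWorks, but the columns and rows are simplified assumptions for illustration; the real flow targets SQL Server and a Synapse dedicated SQL pool.

```python
import sqlite3

# In-memory stand-in for the AdventureWorks source tables (columns simplified).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER PRIMARY KEY, "
            "OrderDate TEXT, CustomerID INTEGER)")
cur.execute("CREATE TABLE SalesOrderDetail (SalesOrderDetailID INTEGER PRIMARY KEY, "
            "SalesOrderID INTEGER, ProductID INTEGER, OrderQty INTEGER, UnitPrice REAL)")
cur.execute("INSERT INTO SalesOrderHeader VALUES (1, '2023-01-05', 100)")
cur.execute("INSERT INTO SalesOrderDetail VALUES (10, 1, 707, 2, 34.99)")

# Step 1: a view that joins header and detail rows.
cur.execute("""
CREATE VIEW SalesFactView AS
SELECT d.SalesOrderID, h.OrderDate, h.CustomerID,
       d.ProductID, d.OrderQty, d.UnitPrice
FROM SalesOrderDetail d
JOIN SalesOrderHeader h ON h.SalesOrderID = d.SalesOrderID
""")

# Step 2: materialise the view into a fact table (a CTAS-style pattern).
cur.execute("CREATE TABLE FactSales AS SELECT * FROM SalesFactView")
rows = cur.execute("SELECT CustomerID, ProductID, OrderQty FROM FactSales").fetchall()
print(rows)  # [(100, 707, 2)]
```

In Synapse dedicated SQL pools the materialisation step would be a CREATE TABLE AS SELECT (CTAS) statement rather than sqlite's CREATE TABLE ... AS SELECT, but the shape of the flow is the same.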

Azure Synapse : Working with External Tables

External tables can be used to read and write data in Hadoop, Azure Blob Storage, and Azure Data Lake. 59: Loading Data using PolyBase. 62: To be continued: Designing a Data Warehouse.
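The actual CREATE EXTERNAL TABLE DDL only runs against Synapse or Hadoop, but the idea (querying a file that lives outside the database, then landing it in a table) can be sketched locally. Below is a hedged analogy in Python, with an in-memory CSV standing in for a file in Blob Storage or Data Lake; the table and data are invented for illustration.

```python
import csv
import io
import sqlite3

# "External" data: a CSV as it might sit in Blob Storage / Data Lake.
external_csv = "ProductID,Name\n707,Helmet\n708,Cap\n"

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE StagingProduct (ProductID INTEGER, Name TEXT)")

# Load the external file into a staging table. This is roughly what the
# PolyBase CREATE EXTERNAL TABLE + CTAS pattern does at warehouse scale.
reader = csv.DictReader(io.StringIO(external_csv))
cur.executemany("INSERT INTO StagingProduct VALUES (:ProductID, :Name)", reader)
count = cur.execute("SELECT COUNT(*) FROM StagingProduct").fetchone()[0]
print(count)  # 2
```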

SQL With Python

https://www.youtube.com/watch?v=zrNHkRgWzTI SQL queries using pandas DataFrames: https://www.youtube.com/watch?v=oPuVYSC_kpo
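To make the SQL-with-pandas link concrete, here is a small sketch of running a SQL query against a database and pulling the result back as a DataFrame. It assumes pandas is installed; the table and figures are made up for illustration.

```python
import sqlite3

import pandas as pd  # assumes pandas is installed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 50.0)])

# Run SQL and get the result back as a DataFrame.
df = pd.read_sql_query(
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region", conn)
totals = df.set_index("region")["total"].to_dict()
print(totals)  # {'East': 150.0, 'West': 250.0}

# The same aggregation expressed in pandas instead of SQL:
raw = pd.read_sql_query("SELECT * FROM sales", conn)
same = raw.groupby("region")["amount"].sum().to_dict()
```

Either direction works: push the aggregation into SQL and read the result, or read the raw rows and aggregate with pandas.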

Azure Synapse

Beyond a standalone SQL Server, you can make use of the Azure Synapse service. Earlier, Azure only had a service for hosting a SQL data warehouse, and then they brought out Azure Synapse. Initially, Azure Synapse was mainly a way to host a SQL data warehouse, but over time many services have been added to Azure Synapse itself. Now you can host a SQL database using the SQL option as part of the entire Azure Synapse ecosystem, and at the same time you can also make use of Apache Spark when it comes to analyzing your data. You can bring your data much closer to your analytical needs by attaching your Azure Data Lake to your Azure Synapse workspace. The tools on the right-hand side can be used for visualization, and you can ingest your data using the data ingestion tools on the left. -- Creating an Azure Synapse workspace: The first thing about working with Azure Synapse is creating...

Queries

Learning to write queries: SELECT 1+1. SELECT is as good as a print statement, and the GO command ends a batch; we will learn what a batch is later.
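A quick way to see this behaviour locally, with sqlite3 standing in for SQL Server. Note that GO is a batch separator understood by SSMS and sqlcmd rather than a T-SQL statement, so it is not part of the query itself and does not appear here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SELECT with no table works like a print statement: it just returns the value.
result = conn.execute("SELECT 1 + 1").fetchone()[0]
print(result)  # 2
```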

Data Engineer Skillsets

Scala is a programming language that combines object-oriented and functional programming paradigms. It is designed to be concise, elegant, and interoperable with Java. Scala runs on the Java Virtual Machine (JVM), which makes it compatible with existing Java libraries and frameworks.

Apache Spark, on the other hand, is an open-source distributed computing system that provides a fast and general-purpose cluster computing framework for big data processing. Spark is designed to be fast and flexible and supports various programming languages, including Scala, Java, Python, and R.

Scala is one of the primary programming languages for Apache Spark. Many of Spark's core components and APIs are written in Scala, and Spark applications can be developed using Scala. The combination of Scala and Spark allows developers to leverage the expressive and concise syntax of Scala while taking advantage of Spark's distributed computing capabilities for processing large datasets. Some key points...

Study Materials - Udemy :

Data Engineer Associate DP-203: https://www.udemy.com/course/data-engineering-on-microsoft-azure/learn/lecture/27327228?start=30#overview
Azure Databricks, Spark for Data Engineers: https://www.udemy.com/course/azure-databricks-spark-core-for-data-engineers/learn/lecture/27514570?start=0#overview
Hadoop Big Data: https://www.udemy.com/course/the-ultimate-hands-on-hadoop-tame-your-big-data/learn/lecture/11863332?start=15#overview
Python: https://www.udemy.com/course/complete-python-developer-zero-to-mastery/learn/lecture/22727561?start=75#overview
T-SQL: https://www.udemy.com/course/70-461-session-2-querying-microsoft-sql-server-2012/learn/lecture/11725694#overview
Apache Spark with Scala: https://www.udemy.com/course/apache-spark-with-scala-hands-on-with-big-data/learn/lecture/11863448#overview