
Section 4: Design & Implement Data Storage

64: Building a Fact Table: We will use our existing SQL Server database, which contains the AdventureWorks sample data. Using SQL Server Management Studio, we connect to both the Azure SQL Database and our dedicated SQL pool, which is the data warehouse. We create a view based on two tables, "SalesOrderDetail" and "SalesOrderHeader", create a table out of that view, and then copy it to the data warehouse on Azure Synapse.
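As a runnable sketch of this view-then-table flow, here is a minimal local stand-in using Python's sqlite3. The table names follow AdventureWorks, but the columns and rows are simplified assumptions for illustration; the real flow targets SQL Server and a Synapse dedicated SQL pool.

```python
import sqlite3

# In-memory stand-in for the AdventureWorks source tables (columns simplified).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER PRIMARY KEY, "
            "OrderDate TEXT, CustomerID INTEGER)")
cur.execute("CREATE TABLE SalesOrderDetail (SalesOrderDetailID INTEGER PRIMARY KEY, "
            "SalesOrderID INTEGER, ProductID INTEGER, OrderQty INTEGER, UnitPrice REAL)")
cur.execute("INSERT INTO SalesOrderHeader VALUES (1, '2023-01-05', 100)")
cur.execute("INSERT INTO SalesOrderDetail VALUES (10, 1, 707, 2, 34.99)")

# Step 1: a view that joins header and detail rows.
cur.execute("""
CREATE VIEW SalesFactView AS
SELECT d.SalesOrderID, h.OrderDate, h.CustomerID,
       d.ProductID, d.OrderQty, d.UnitPrice
FROM SalesOrderDetail d
JOIN SalesOrderHeader h ON h.SalesOrderID = d.SalesOrderID
""")

# Step 2: materialise the view into a fact table (a CTAS-style pattern).
cur.execute("CREATE TABLE FactSales AS SELECT * FROM SalesFactView")
rows = cur.execute("SELECT CustomerID, ProductID, OrderQty FROM FactSales").fetchall()
print(rows)  # [(100, 707, 2)]
```

In Synapse dedicated SQL pools the materialisation step would be a CREATE TABLE AS SELECT (CTAS) statement rather than sqlite's CREATE TABLE ... AS SELECT, but the shape of the flow is the same.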

Azure Synapse : Working with External Tables

External tables can be used to read and write data in Hadoop, Azure Blob Storage, and Azure Data Lake. 59: Loading Data using PolyBase. 62: To be continued: Designing a Data Warehouse.
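The actual CREATE EXTERNAL TABLE DDL only runs against Synapse or Hadoop, but the idea (querying a file that lives outside the database, then landing it in a table) can be sketched locally. Below is a hedged analogy in Python, with an in-memory CSV standing in for a file in Blob Storage or Data Lake; the table and data are invented for illustration.

```python
import csv
import io
import sqlite3

# "External" data: a CSV as it might sit in Blob Storage / Data Lake.
external_csv = "ProductID,Name\n707,Helmet\n708,Cap\n"

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE StagingProduct (ProductID INTEGER, Name TEXT)")

# Load the external file into a staging table. This is roughly what the
# PolyBase CREATE EXTERNAL TABLE + CTAS pattern does at warehouse scale.
reader = csv.DictReader(io.StringIO(external_csv))
cur.executemany("INSERT INTO StagingProduct VALUES (:ProductID, :Name)", reader)
count = cur.execute("SELECT COUNT(*) FROM StagingProduct").fetchone()[0]
print(count)  # 2
```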

SQL With Python

https://www.youtube.com/watch?v=zrNHkRgWzTI SQL queries using pandas DataFrames: https://www.youtube.com/watch?v=oPuVYSC_kpo
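To make the SQL-with-pandas link concrete, here is a small sketch of running a SQL query against a database and pulling the result back as a DataFrame. It assumes pandas is installed; the table and figures are made up for illustration.

```python
import sqlite3

import pandas as pd  # assumes pandas is installed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 50.0)])

# Run SQL and get the result back as a DataFrame.
df = pd.read_sql_query(
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region", conn)
totals = df.set_index("region")["total"].to_dict()
print(totals)  # {'East': 150.0, 'West': 250.0}

# The same aggregation expressed in pandas instead of SQL:
raw = pd.read_sql_query("SELECT * FROM sales", conn)
same = raw.groupby("region")["amount"].sum().to_dict()
```

Either direction works: push the aggregation into SQL and read the result, or read the raw rows and aggregate with pandas.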

Azure Synapse

Beyond a standalone SQL Server, you can make use of the Azure Synapse service. Earlier, Azure only had a service for hosting a SQL data warehouse, and then they brought out Azure Synapse. Initially, Azure Synapse was mainly a way to host a SQL data warehouse, but over time many services have been added to Azure Synapse itself. Now you can host a SQL database using the SQL option as part of the entire Azure Synapse ecosystem, and at the same time you can also make use of Apache Spark when it comes to analyzing your data. You can bring your data much closer to your analytical needs by attaching your Azure Data Lake to your Azure Synapse workspace. The tools on the right-hand side can be used for visualization, and you can ingest your data using the data ingestion tools on the left. -- Creating an Azure Synapse workspace: The first thing about working with Azure Synapse is creating...

Queries

Learning to write queries: SELECT 1+1. SELECT is as good as a print statement, and the GO command ends a batch; we will learn what a batch is later.
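A quick way to see this behaviour locally, with sqlite3 standing in for SQL Server. Note that GO is a batch separator understood by SSMS and sqlcmd rather than a T-SQL statement, so it is not part of the query itself and does not appear here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SELECT with no table works like a print statement: it just returns the value.
result = conn.execute("SELECT 1 + 1").fetchone()[0]
print(result)  # 2
```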

Data Engineer Skillsets

Scala is a programming language that combines object-oriented and functional programming paradigms. It is designed to be concise, elegant, and interoperable with Java. Scala runs on the Java Virtual Machine (JVM), which makes it compatible with existing Java libraries and frameworks.

Apache Spark, on the other hand, is an open-source distributed computing system that provides a fast and general-purpose cluster computing framework for big data processing. Spark is designed to be fast and flexible and supports various programming languages, including Scala, Java, Python, and R.

Scala is one of the primary programming languages for Apache Spark. Many of Spark's core components and APIs are written in Scala, and Spark applications can be developed using Scala. The combination of Scala and Spark allows developers to leverage the expressive and concise syntax of Scala while taking advantage of Spark's distributed computing capabilities for processing large datasets. Some key points...

Study Materials - Udemy :

Data Engineer Associate DP-203: https://www.udemy.com/course/data-engineering-on-microsoft-azure/learn/lecture/27327228?start=30#overview
Azure Databricks, Spark for Data Engineers: https://www.udemy.com/course/azure-databricks-spark-core-for-data-engineers/learn/lecture/27514570?start=0#overview
Hadoop Big Data: https://www.udemy.com/course/the-ultimate-hands-on-hadoop-tame-your-big-data/learn/lecture/11863332?start=15#overview
Python: https://www.udemy.com/course/complete-python-developer-zero-to-mastery/learn/lecture/22727561?start=75#overview
T-SQL: https://www.udemy.com/course/70-461-session-2-querying-microsoft-sql-server-2012/learn/lecture/11725694#overview
Apache Spark with Scala: https://www.udemy.com/course/apache-spark-with-scala-hands-on-with-big-data/learn/lecture/11863448#overview