Only tables in Delta format are available in the SQL analytics endpoint. Parquet, CSV, and other formats cannot be queried using the SQL analytics endpoint. If you don't see your table, you need to convert it to Delta format first (a conversion sketch appears at the end of this article).

Automatic table discovery and registration

Automatic table discovery and registration is a Lakehouse feature that provides a fully managed file-to-table experience for data engineers and data scientists. You can drop a file into the managed area of the Lakehouse, and the system automatically validates it for supported structured formats and registers it in the metastore with the necessary metadata, such as column names, formats, compression, and more. (Currently, the only supported format is Delta table.) You can then reference the file as a table and use SparkSQL syntax to interact with the data (see the query sketch at the end of this article).

Interacting with the Lakehouse item

A data engineer can interact with the lakehouse and the data within the lakehouse in several ways:

- The Lakehouse explorer: The explorer is the main Lakehouse interaction page. You can load data into your Lakehouse, explore data in the Lakehouse using the object explorer, set MIP labels, and perform various other tasks. Learn more about the explorer experience: Navigate the Fabric Lakehouse explorer.

- Notebooks: Data engineers can use notebooks to write code that reads, transforms, and writes directly to the Lakehouse as tables and/or folders (a notebook sketch also appears at the end of this article). Learn more about how to use notebooks for Lakehouse: Explore the data in your lakehouse with a notebook and How to use a notebook to load data into your lakehouse.

- Pipelines: Data engineers can use data integration tools such as the pipeline copy tool to pull data from other sources and land it in the Lakehouse. Find more information on how to use the copy activity: How to copy data using copy activity.

- Apache Spark job definitions: Data engineers can develop robust applications and orchestrate the execution of compiled Spark jobs in Java, Scala, and Python. Learn more about Spark jobs: What is an Apache Spark job definition?

- Dataflows Gen 2: Data engineers can use Dataflows Gen 2 to ingest and prepare their data. Find more information on loading data using dataflows: Create your first dataflow to get and transform data.

Learn more about the different ways to load data into your lakehouse: Options to get data into the Fabric Lakehouse.

Lakehouse also provides an enhanced multitasking experience to make your data management journey as efficient and user-friendly as possible. Its browser-tab design lets you open and switch between multiple items seamlessly, so you can manage your data lakehouse more efficiently than ever, with no more juggling between different windows or losing track of your tasks.
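Since the SQL analytics endpoint only exposes Delta tables, here is a minimal PySpark sketch of the conversion step mentioned above. It assumes a Fabric notebook, where `spark` is a pre-created SparkSession; the file path, reader options, and table name are illustrative, not taken from the product documentation.

```python
# Minimal sketch: rewrite a CSV file from the lakehouse Files area as a
# Delta table so it becomes visible to the SQL analytics endpoint.
# "Files/raw/sales.csv" and "sales" are illustrative names.
df = spark.read.option("header", True).csv("Files/raw/sales.csv")

# Saving with the delta format into the managed Tables area registers
# the result as a lakehouse table.
df.write.format("delta").mode("overwrite").saveAsTable("sales")
```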
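Once a table is registered, the SparkSQL interaction described under automatic table discovery might look like the following. The `sales` table and its `region` and `amount` columns are hypothetical placeholders.

```python
# Query a lakehouse table that automatic discovery has registered in the
# metastore. Table and column names are hypothetical.
result = spark.sql("""
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    GROUP BY region
    ORDER BY total_amount DESC
""")
result.show()
```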
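Finally, a sketch of the notebook read-transform-write flow: reading a lakehouse table, filtering it, and writing the result back both as a table and as files in a folder. All table, column, and path names are again illustrative.

```python
# Read an existing lakehouse table into a DataFrame.
orders = spark.table("sales")

# Apply a transformation; "order_date" is a hypothetical column.
recent = orders.where(orders.order_date >= "2024-01-01")

# Write back as a managed Delta table...
recent.write.format("delta").mode("overwrite").saveAsTable("sales_recent")

# ...and as Parquet files under the unmanaged Files area.
recent.write.mode("overwrite").parquet("Files/exports/sales_recent")
```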