
Enterprise Software Architect Reacts to Microsoft Fabric

This is an enterprise software architect’s reaction to Microsoft Fabric. Enjoy!

Microsoft Fabric is a fascinating and extensive product. You need to understand concepts like DataFrames, Spark, Dataflows, Pandas🐼, Lakehouse, OneLake, LakeOfTears etc. to be able to navigate the world of Fabric. In the beginning, it may feel overwhelming, and you may not know where to start. Fortunately, Microsoft provides excellent documentation on Microsoft Learn, and you can always spend tens of hours watching an Indian guy explain it all to you. For example, Radacad has some good tutorial videos about Fabric (I think they are not Indian, though).

Currently, you can create a Fabric environment with a free trial and use it for 60 days. In my experience the trial counter is broken, because I’m stuck at 59 days. I don’t know whether it works like that for others or whether I’m just a lucky bastard (thank you Microsoft, please don’t fix my counter).

My background is in software development, so I like to split things into methods, abstract things with interfaces, build unit tests and test my code all the time. However, in Fabric, you don’t do that.

No unit tests for you!

Testing relies heavily on manual work. You simulate data, run pipelines manually and check the results. There are no tools for writing xUnit- or MSTest-style tests…
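The simulate-run-check loop can at least be scripted so you are not comparing numbers by eye. Here is a minimal sketch in plain Python. The transformation `total_per_customer`, the helper names and the order data are all hypothetical and have nothing to do with any Fabric API; the idea is only that the same pattern could live in the last cell of a notebook.

```python
from collections import defaultdict


def simulate_orders():
    """Return a small, hand-made data set standing in for real source data."""
    return [
        {"customer": "alice", "amount": 10.0},
        {"customer": "bob", "amount": 5.5},
        {"customer": "alice", "amount": 2.5},
    ]


def total_per_customer(rows):
    """Hypothetical transformation under test: sum the amounts per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def check_results(actual, expected):
    """Return a list of readable mismatches; an empty list means all good."""
    problems = []
    for key in sorted(set(actual) | set(expected)):
        if actual.get(key) != expected.get(key):
            problems.append(
                f"{key}: expected {expected.get(key)}, got {actual.get(key)}"
            )
    return problems


# Simulate data, run the transformation, check the result.
result = total_per_customer(simulate_orders())
mismatches = check_results(result, {"alice": 12.5, "bob": 5.5})
print("OK" if not mismatches else mismatches)
```

It is still you pressing the run button, but at least the comparison itself is repeatable.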

CI/CD

Well, the lack of an automated testing platform is not a showstopper. Who has time to write unit tests anyway? However, as a DevOps guy I won’t give up on CI/CD pipelines and automated deployments. Luckily, you can, and actually you SHOULD, use DevOps practices when working with Fabric (at least for the parts that are currently supported). In the Fabric world, Azure DevOps is the cool kid and GitHub is something that Microsoft only whispers about. For example, in this Microsoft Learn article “Azure DevOps” is mentioned 14 times and GitHub only twice!

The Code

What about the most beloved child? Can I somehow code that thing? Well, yes you can! Fabric supports programming languages like Python, Scala, SQL and R. All the goodies that are totally new to enterprise architects. Where are Java, .NET and C++? What about Visual Basic and React? OK, the last one was a bit too much. A product like Fabric cannot jump on every JS framework bandwagon. Still, in Fabric you have to learn Python or Scala. You can do queries and a little data manipulation with SQL, but for real data manipulation you will need to learn Python (PySpark) or Scala (Spark).

Coding in Fabric is based on notebooks, which you can easily develop locally in Visual Studio Code and then copy-paste (yes, you read that right) into Fabric. VS Code has some good extensions for PySpark, and the development environment is quite easy to set up, even on Windows.
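One way to keep the copy-paste step tolerable is to write the local script as notebook-style cells. This is a sketch, assuming the Python tooling in VS Code, which treats `# %%` comments as cell boundaries. The function `clean_names` and the sample data are hypothetical, and it uses plain Python instead of PySpark so that it runs without a Spark installation.

```python
# %% Cell 1: sample input (made-up data standing in for a real table)
raw_names = ["  Alice ", "BOB", "", "carol"]


# %% Cell 2: the logic you would paste into a notebook cell
def clean_names(names):
    """Trim whitespace, drop empty values and normalise capitalisation."""
    return [name.strip().title() for name in names if name.strip()]


# %% Cell 3: run it and eyeball the output
cleaned = clean_names(raw_names)
print(cleaned)
```

Each cell maps to one notebook cell, so the paste is at least one block at a time instead of one giant wall of code.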

However, there are a lot of features that don’t require (or don’t even support) coding. You can define mappings with a GUI tool, you can import data without touching SQL, and you don’t have to make REST API calls to explore data. Fabric has tools for all of this. You don’t need Rust to access your data; you can publish a SQL endpoint with a few mouse clicks.

Summary

Fabric, and data science in general, are new worlds for programmers. You cannot use the tools you’re accustomed to, and you have to give up on some principles, like having 100% unit test coverage. However, there is great promise in Fabric, which is to simplify big data handling. You don’t need to do everything by yourself anymore. There are better languages than C++ for parallel data handling. You just need to adapt to this new world and start learning new things again, just like you did when you began your programming career.