Integrating AI into Data Engineering and Business Intelligence

Integrating AI into Data Engineering and Business Intelligence (BI)

AI is a hot topic right now and is rapidly becoming a core component of business operations, large and small. But what does it actually mean for us as business users, and how can it realistically affect our daily working lives? Is it a genuine value-add, or merely an expensive distraction that will have us wishing Pandora's box had stayed firmly closed? With that question in mind, I decided to find out for myself.

Data Exploration

As a data engineer, I regularly spend considerable time getting to grips with partners' data and planning how best to leverage it for business reporting. This requires painstaking analysis of large, varied datasets, and its complexity scales with the size and nature of the business involved.

It takes years of experience to recognise patterns in data that lead to the most effective and efficient BI models. So, the question is: can AI help streamline this process, without requiring the depth of expertise that a seasoned data professional brings?

AI Exploration

To find out, I built an AI Agent using Microsoft Foundry and connected it to my Microsoft Fabric workspace via notebooks. I created a Lakehouse and populated it with several raw tables of sample data, simulating a typical operational database. To make the test genuinely challenging, I introduced a range of common data quality issues: duplicate records, misformatted strings and integers, null values, and inconsistent whitespace.
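
To make those four categories of issue concrete, here is a minimal sketch of how they might be counted in a raw table. It is not the code used in the experiment: the rows, the column names and the `profile` and `is_int` helpers are all invented for illustration, and plain Python is used so the example runs anywhere. In a Fabric notebook the same checks would more likely be written against Spark or pandas DataFrames.

```python
from collections import Counter

# Hypothetical raw rows, invented for illustration (not the article's data).
# Assumes every raw value arrives as a string or None.
raw_customers = [
    {"customer_id": "1", "name": "Alice ", "age": "34"},    # trailing space
    {"customer_id": "2", "name": "  Bob", "age": "forty"},  # bad integer
    {"customer_id": "2", "name": "  Bob", "age": "forty"},  # exact duplicate
    {"customer_id": "3", "name": None, "age": "29"},        # null value
]


def is_int(value):
    """Return True when int() can parse the value."""
    try:
        int(value)
        return True
    except (TypeError, ValueError):
        return False


def profile(rows, int_columns):
    """Count common data quality issues across a list of dict rows."""
    report = {
        "duplicate_rows": 0,
        "nulls": Counter(),
        "whitespace": Counter(),
        "bad_integers": Counter(),
    }
    seen = set()
    for row in rows:
        # A row is a duplicate when every column matches an earlier row.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            report["duplicate_rows"] += 1
        seen.add(fingerprint)
        for column, value in row.items():
            if value is None:
                report["nulls"][column] += 1
                continue
            if value != value.strip():
                report["whitespace"][column] += 1
            if column in int_columns and not is_int(value.strip()):
                report["bad_integers"][column] += 1
    return report


report = profile(raw_customers, int_columns={"customer_id", "age"})
print(report)
```

Note that duplicate rows are still included in the per-column counts, so one underlying problem can be reported more than once. Whether that is desirable is a design choice.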

With this sample data in place, I tasked the first AI model with exploring and analysing the dataset - identifying key table relationships, flagging problematic data and suggesting remediation steps. The initial results were imperfect, but they demonstrated that AI could attempt this process with impressive speed. With some refinement, the model produced a highly useful summary of the data structure, key relationships and quality issues. The accuracy significantly exceeded my initial expectations.
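
One simple heuristic for spotting table relationships is value containment: if every value in one column also appears in a unique column of another table, the pair is a candidate foreign key. The sketch below illustrates that idea only. It makes no claim about how the AI model reached its conclusions, and the tables and the `candidate_foreign_keys` function are hypothetical.

```python
def candidate_foreign_keys(tables):
    """Suggest (child column, parent column) pairs by value containment.

    `tables` maps a table name to a non-empty list of dict rows.
    """
    suggestions = []
    for parent_name, parent_rows in tables.items():
        for parent_col in parent_rows[0]:
            parent_values = [row[parent_col] for row in parent_rows]
            unique_values = set(parent_values)
            if len(unique_values) != len(parent_values):
                continue  # not unique, so it cannot act as a key
            for child_name, child_rows in tables.items():
                if child_name == parent_name:
                    continue
                for child_col in child_rows[0]:
                    child_values = {row[child_col] for row in child_rows}
                    # Every child value must exist in the parent column.
                    if child_values <= unique_values:
                        suggestions.append(
                            (f"{child_name}.{child_col}",
                             f"{parent_name}.{parent_col}")
                        )
    return suggestions


# Hypothetical sample tables, invented for illustration.
tables = {
    "customers": [
        {"customer_id": 1, "name": "Alice"},
        {"customer_id": 2, "name": "Bob"},
    ],
    "orders": [
        {"order_id": 10, "customer_id": 1, "amount": 250},
        {"order_id": 11, "customer_id": 1, "amount": 99},
        {"order_id": 12, "customer_id": 2, "amount": 40},
    ],
}

links = candidate_foreign_keys(tables)
print(links)
```

On small samples this heuristic can produce false positives, for example when two unrelated integer columns happen to overlap, so its output is a list of suggestions to review rather than a confirmed data model.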

AI Transformation

With exploration complete, I turned to the next question: could AI take those findings and transform the data into a Silver layer following a typical Medallion Architecture? I tasked a second AI model with reviewing the issues identified in the first step and generating transformation queries to clean and restructure each table.

This stage proved considerably more challenging. It required further iteration and tuning to ensure the model handled the data correctly. Ultimately, it did produce clean Fact and Dimension tables, theoretically ready for use in a Semantic Model - though the journey there required meaningful human oversight throughout.
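
For a sense of what a Silver-layer cleaning step involves, here is a minimal sketch, assuming a single raw table of string values. It is written in plain Python rather than as the transformation queries the model generated, and the `to_silver` function, the sample rows and the column names are hypothetical. Rows that cannot be fixed automatically are set aside for a person to look at instead of being guessed at.

```python
def to_silver(rows, key, int_columns):
    """Trim strings, cast integer columns and drop duplicate keys.

    Returns (clean_rows, needs_review). Rows whose integer columns
    cannot be cast are returned unchanged in needs_review.
    """
    clean, needs_review, seen_keys = [], [], set()
    for row in rows:
        # Strip leading and trailing whitespace from every string value.
        fixed = {
            column: value.strip() if isinstance(value, str) else value
            for column, value in row.items()
        }
        try:
            for column in int_columns:
                fixed[column] = int(fixed[column])
        except (TypeError, ValueError):
            needs_review.append(row)  # keep the original for a human
            continue
        if fixed[key] in seen_keys:
            continue  # keep only the first row for each key
        seen_keys.add(fixed[key])
        clean.append(fixed)
    return clean, needs_review


# Hypothetical raw rows, invented for illustration.
raw_orders = [
    {"order_id": " 10", "customer": "Alice ", "quantity": "2"},
    {"order_id": "10", "customer": "Alice", "quantity": "2"},   # duplicate once trimmed
    {"order_id": "11", "customer": " Bob", "quantity": "two"},  # cannot be cast
    {"order_id": "12", "customer": "Cara", "quantity": None},   # null quantity
]

silver_orders, review_queue = to_silver(
    raw_orders, key="order_id", int_columns=("order_id", "quantity")
)
print(silver_orders)
print(len(review_queue), "row(s) need review")
```

Keeping the first occurrence of a duplicate key is only one possible rule. Which record should win is a business decision, and it is exactly the kind of choice that needed human oversight here.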

Is AI Usable in Data Engineering?

In short, yes - I believe AI can provide genuine, practical assistance in preparing and cleaning data and making it report-ready. However, its current limitations are equally apparent. Preventing the model from producing inaccurate or misleading outputs required significant engineering effort. On that basis, I would not rely on AI alone to perform these tasks. Where it excels is in handling straightforward, well-defined data issues and flagging those it cannot resolve, handing them off to a human data engineer for review. In that context, I see it as a powerful force multiplier for the industry.

Could Anyone Use These Tools?

AI tools are widely accessible, and in theory anyone could use ChatGPT, Claude, Gemini, or similar platforms to attempt these tasks. However, it quickly became clear to me that my advantage in building this process lay in a deep understanding of the challenges inherent in data preparation - knowing what the model should look for and how to evaluate whether it had succeeded.

This raises an important question: how can you be confident that your data has been prepared correctly? Without genuine domain expertise, you would need to place complete trust in AI - and that trust may not be warranted. So, while anyone can use these tools, the more important question is: should they, and to what extent, before seeking the advice of an experienced professional? Reliable BI depends entirely on confidence in your data. Without that foundation, even the most sophisticated reporting infrastructure can produce misleading results.
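
One partial answer is to check the prepared tables with rules that do not depend on the AI at all. The sketch below assumes a single fact table and a single dimension table and runs three basic checks: unique dimension keys, no missing values in required columns, and no fact rows pointing at a key that does not exist. The `validate_star_schema` function and the sample tables are hypothetical, and passing these checks shows only that the structure is sound, not that the figures are right for the business.

```python
def validate_star_schema(fact_rows, dim_rows, fact_fk, dim_key, required):
    """Return a list of failure messages; an empty list means all checks passed."""
    failures = []
    dim_keys = [row[dim_key] for row in dim_rows]
    known_keys = set(dim_keys)

    # Check 1: the dimension key must be unique.
    if len(known_keys) != len(dim_keys):
        failures.append(f"{dim_key} is not unique in the dimension table")

    # Check 2: required fact columns must not be null.
    for column in required:
        missing = sum(1 for row in fact_rows if row.get(column) is None)
        if missing:
            failures.append(f"{missing} fact row(s) have no {column}")

    # Check 3: every fact row must reference an existing dimension key.
    orphans = sum(1 for row in fact_rows if row.get(fact_fk) not in known_keys)
    if orphans:
        failures.append(f"{orphans} fact row(s) point at an unknown {dim_key}")

    return failures


# Hypothetical tables, invented for illustration.
dim_customer = [
    {"customer_key": 1, "name": "Alice"},
    {"customer_key": 2, "name": "Bob"},
]
fact_sales = [
    {"sale_id": 100, "customer_key": 1, "amount": 250},
    {"sale_id": 101, "customer_key": 3, "amount": None},  # orphan with null amount
]

failures = validate_star_schema(
    fact_sales, dim_customer,
    fact_fk="customer_key", dim_key="customer_key",
    required=("customer_key", "amount"),
)
for message in failures:
    print("FAILED:", message)
```

Checks like these are cheap to run after every transformation, but deciding which rules matter still takes someone who understands the data.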

The potential applications of AI appear almost limitless. It is a tool with the capacity to meaningfully improve efficiency and productivity across the industry. But it is not a tool to be used blindly. As the saying goes: "A tool is only as good as the hand that wields it."
