Data Science

AI/ML, Data Science

Streamlining Invoice Processing with AI: How Accelerant InfoMiner Makes It Seamless

Invoice processing is at the heart of any business’s operations, but too often it’s bogged down by inefficiencies. Manual data entry, mismatched formats, and approval delays can turn what should be a straightforward task into a time-consuming, error-prone bottleneck. At Accelerant, we built InfoMiner to change that.

The Problem with Traditional Invoice Processing

Many organizations still rely on manual or semi-automated workflows to manage invoices, which leads to a range of issues. These inefficiencies don’t just impact productivity; they increase the risk of late payments, duplicate entries, and strained vendor relationships.

Introducing InfoMiner: AI-Powered Invoice Automation

InfoMiner is Accelerant’s intelligent automation platform that leverages AI and machine learning to revolutionize invoice processing.

🔍 Intelligent Data Extraction

InfoMiner can read and interpret invoices in any format or layout, extracting key details such as vendor names, invoice numbers, dates, line items, and totals. Unlike rule-based systems, InfoMiner learns and adapts to new formats over time; no rigid templates or manual configuration required.

⚙️ Real-Time Validation

InfoMiner instantly validates extracted data against your internal business rules or ERP system and automatically detects discrepancies such as duplicate or mismatched entries. This enables your team to resolve issues before approval, minimizing errors and delays.

🚀 Seamless Workflow Integration

InfoMiner integrates effortlessly with your existing systems (ERP, accounting software, or workflow tools) so your team can work smarter using familiar platforms while AI handles the heavy lifting.

📈 Speed, Accuracy, and Scalability

With InfoMiner, you can expect faster processing cycles, higher accuracy, and the ability to scale as invoice volumes grow.

The Result: A Smarter, Faster Finance Function

Whether you process hundreds or thousands of invoices per month, InfoMiner delivers a faster, more accurate, and more scalable finance operation.

Ready to Modernize Your Invoice Processing?

Say goodbye to spreadsheets, copy-paste tasks, and long approval delays. Say hello to AI-driven efficiency with InfoMiner.
👉 Learn more or schedule a live demo. Accelerant is helping forward-thinking companies unlock the full potential of AI in finance and operations. Visit accelerant.com to explore what’s possible.


Customer Experience

Software technology provider reduces technical support escalations by 15%

The Customer
A software technology provider with a global footprint.

The Challenge

The Solution
Accelerant partnered with the customer’s technical support department to predict hot technical support cases. The customer was looking to create a process to manage those technical support cases based on actionable intelligence provided by Accelerant’s learning algorithm.

Step 1: Accelerant created integrations to extract data from the technical support portal, the customer relationship management (CRM) system, customer satisfaction survey data, and other web portals. Structured as well as unstructured data was used in this analysis.
Step 2: Data was harmonized to create a timeline of all support-centric interactions with the technical as well as professional services departments.
Step 3: The Accelerant team designed analyses based on various features, which were then used in data science modeling techniques.
Step 4: The Accelerant team created a support ticket escalation score which was continuously updated based on new interactions.
Step 5: A support executive was alerted whenever a threshold score was breached. The solution created an executive summary for each support case, which helped the executive hold quick meetings with their teams.

Result

How can we assist you?
Call us today at 949 391 4125 to learn about our free assessment.
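The threshold-alert step described above can be sketched as follows. This is an illustrative stand-in, not Accelerant's actual model: the interaction types, weights, and the 0.8 threshold are all hypothetical assumptions.

```python
# Hypothetical sketch of a ticket escalation score with threshold alerting.
# Interaction labels, weights, and threshold are illustrative assumptions.

def escalation_score(interactions):
    """Naively score a ticket from its interaction history
    (a stand-in for the learned model in the case study)."""
    weights = {"reopened": 0.3, "negative_survey": 0.4,
               "exec_mention": 0.2, "update": 0.05}
    return min(1.0, sum(weights.get(i, 0.0) for i in interactions))

def check_and_alert(ticket_id, interactions, threshold=0.8):
    """Re-score the ticket on each new interaction; alert when breached."""
    score = escalation_score(interactions)
    if score >= threshold:
        return f"ALERT: ticket {ticket_id} breached threshold (score={score:.2f})"
    return None
```

In a real deployment the score would come from the trained model and the alert would be routed to the support executive rather than returned as a string.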


AI Models – Go Local or Go Cloud – Pros and Cons

Businesses today are constantly facing new and bigger challenges and are being asked to do more for less. Technology is diversifying and evolving faster than ever to address these issues, giving industries across the board a growing number of solutions to sift through.

Local AI and cloud-based AI each have their own advantages and disadvantages, and the choice between them depends on specific use cases and requirements. Here are some reasons why a software development team might prefer local AI over cloud-based AI in certain situations:

1. Data Privacy and Security: If you are working with sensitive data, keeping AI algorithms and models on a local server or device can provide better control over data privacy and security. This is especially important in industries like healthcare, finance, and defense.
2. Low Latency: Local AI can provide lower latency since data processing and inference happen on the local hardware. This is crucial in applications where real-time responses are required, such as autonomous vehicles or robotics.
3. Offline Capabilities: In situations where an internet connection is unreliable or unavailable, local AI can still function, while cloud-based AI depends on a continuous connection to the cloud.
4. Cost Management: Depending on usage, cloud-based AI services can become costly over time, especially if you have large volumes of data to process. Local AI can provide cost savings in the long run since you don’t pay for cloud resources.
5. Customization and Control: With local AI, you have more control over the algorithms and models you use, allowing for customization to meet specific needs. In contrast, cloud-based AI services may offer limited customization options.
6. Compliance: Certain industries and organizations have strict regulatory compliance requirements. Keeping AI local can simplify compliance with data handling regulations.
However, it’s important to note that local AI also has limitations. It may require substantial hardware resources, regular updates, and maintenance. Additionally, cloud-based AI has its own advantages, such as scalability, ease of deployment, and access to a wide range of pre-trained models and services. In some cases, a hybrid approach combining both local and cloud-based AI can be the most effective solution, allowing you to leverage the benefits of both paradigms. Ultimately, the choice between local and cloud AI depends on your specific project goals and constraints.


Converse with Data using NLP / LLM – The Holy Grail!

Industry analysts have been predicting that "the future of BI is conversational" for quite some time now. Giant leaps made recently by various LLM models have reinvigorated that quest. Yet production-grade conversational BI solutions are very hard to engineer. Business users are still looking for insights in BI dashboards, and data analysts are still hand-writing SQL queries against their databases to answer ad-hoc business questions.

Why is conversational BI still not here?

The vast majority of enterprise data is still in structured data stores and accessible mainly through SQL queries. Any conversational BI solution needs an engine that translates the user’s natural language question into valid SQL or a Pandas dataframe expression. Engineers have tried to build "Natural Language to SQL" (NL2SQL) engines since the 70s using rules-based techniques, which would very quickly get too complex to be useful. But with the advancement of transformers, which have enabled tools like GitHub Copilot, OpenAI Code Interpreter, and Langchain’s pandas_dataframe_agent and SQL Agent, it would seem this should be a trivial problem to solve. It is not.

There are (at least) two ways a company can build an LLM-based NL2SQL engine to enable conversational BI:

1. Fine-tuning your own LLM: This approach requires taking an existing LLM and training it further using NL<>SQL pairs relating to the company’s structured data. Two challenges with this approach are that a) coming up with the training dataset is hard and expensive, and b) the most powerful LLM model around (GPT-4) cannot be fine-tuned (as of this writing).
2. Leveraging in-context learning: The latest LLM models (like GPT-4-32K) can write SQL quite well out of the box and have a context window large enough for quite a bit of few-shot training, and for an agent to try to recover from errors by performing follow-ups using chain-of-thought techniques.
The idea here is to build an LLM agent on top of GPT-4 that can implement NL2SQL with few-shot learning. So what are the challenges of deploying solution #2? Here are seven we have encountered:

1. Table and column descriptions: Even the best data teams often do not have clear documentation about tables, columns, and metadata. With the rise of ELT, where data is simply dumped in the warehouse from various sources and transformed on query, the situation becomes even worse. Therefore the table and column names might be the only info available to the engine at configuration time.
2. Missing context and metadata: There are often business definitions which live in data analysts’ heads and are not in the underlying data. We encountered a real-world home rental marketplace for which what constitutes an "active listing" is a combination of WHERE clauses that differ based on the value of another column which specifies the building type. In rare cases these are stored as views on the table, but more often than not they are just stored in a query in the BI tool/dashboard.
3. Incomplete info in the question, lack of "common sense": Take "what was the average rent in Los Angeles in May 2023?" A reasonable human receiving this question would simply assume it is about Los Angeles, CA, or would confirm with the asker in a follow-up. However, an LLM usually translates this to SELECT price FROM rent_prices WHERE city = 'Los Angeles' AND month = '05' AND year = '2023', which pulls up data for both Los Angeles, CA and Los Angeles, TX without even returning columns to differentiate between the two.
4. Speed: For the engine to be "conversational," response times must be fast (sub-30s). This is often very hard to achieve, especially if the agent tries to recover from errors or evaluate generated responses with subsequent LLM calls.
5. Complex queries: While GPT-4 writes simple SQL queries very well, it can often stumble on complex queries that require aggregations and joins. This is exacerbated in cases where a column name contains an action that can be done in SQL (for example, Average or SUM), and in join operations on data warehouses where FOREIGN KEYs are not clearly enforced like they are in production DBs.
6. Privacy and data leaking: Many organizations do not want their database data or schema being sent to companies like OpenAI, since it can leak into their training corpus.
7. Validation: There is no known way to identify cases where the system returns syntactically valid but incorrect SQL. For example, if the user asks for an "average" value, the system might run an AVG instead of picking a column called average_price.

So is enterprise conversational BI impossible in 2023? Will there be a few more years of academic papers and company AI hackathon projects before a solution can be deployed in production? We don’t think so. While the challenges are definitely real, we believe that with the right tools an enterprise data team can deploy solutions that enable business users to self-serve ad-hoc data questions from the company data warehouse. In the coming weeks we will be releasing a number of open source and hosted tools to address this.
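To make the in-context-learning approach concrete, here is a minimal sketch of assembling a few-shot NL2SQL prompt from a schema plus example NL<>SQL pairs. The schema, the example pair, and the function names are illustrative assumptions, not our actual engine; the completed prompt would be sent to an LLM such as GPT-4.

```python
# Sketch of a few-shot NL2SQL prompt builder. Schema and examples are made up.

SCHEMA = """CREATE TABLE rent_prices (
    city TEXT, state TEXT, month TEXT, year TEXT, price NUMERIC
);"""

# Each example pair demonstrates desired behavior; note the state filter,
# which nudges the model past the "Los Angeles" ambiguity described above.
FEW_SHOT = [
    ("what was the average rent in Austin, TX in May 2023?",
     "SELECT AVG(price) FROM rent_prices "
     "WHERE city = 'Austin' AND state = 'TX' AND month = '05' AND year = '2023';"),
]

def build_prompt(question: str) -> str:
    """Assemble schema + few-shot pairs + the new question into one prompt."""
    parts = ["You translate questions into SQL. Schema:", SCHEMA, ""]
    for q, sql in FEW_SHOT:
        parts += [f"Q: {q}", f"SQL: {sql}", ""]
    parts += [f"Q: {question}", "SQL:"]
    return "\n".join(parts)
```

In practice the few-shot pairs would be curated per table, and an agent loop would retry on SQL errors; the prompt shape is the core of the technique.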


Chat with Documents with Improved Response Accuracy

Chatting with documents using natural language is one of the most sought-after LLM use cases among enterprises. A short name for this problem is RAG: Retrieval Augmented Generation. However, RAG lacks transparency in revealing what it retrieves, making it uncertain how the system will handle a given question. Valuable information can get lost amid a mass of irrelevant text, which is not ideal for a production-grade application.

Techniques for improving RAG performance

After almost a year of building with LLMs, I have learned many techniques to improve RAG performance and summarized some of my lessons below. In this section, I will go over a few tested techniques:

- Adding additional info in the header or footer of each chunk
- Adding metadata to each chunk
- Adding summarized info to each chunk
- Using Langchain’s "Parent Document Retriever" with two sets of chunk sizes

Pre-retrieval step

Despite the recent tremendous interest in utilizing NLP for a wider range of real-world applications, most NLP papers, tasks, and pipelines assume raw, clean text. However, many texts we encounter in the wild are not so clean; many of them are visually structured documents (VSDs) such as PDFs. Conventional preprocessing tools for VSDs have mainly focused on word segmentation and coarse layout analysis. PDFs are versatile, preserving the visual integrity of documents, but they often pose a significant challenge when it comes to extracting and manipulating their contents.

We have all heard of "garbage in, garbage out". It applies to RAG too, but many people ignore this step and focus on optimizing everything after this crucial initial step. You cannot simply extract text from your documents, throw it into a vector database, and expect reliable, accurate answers. Extraction of the text and tables from the documents has to be semantically accurate and coherent.
Here is an example from my own experience. I had 10 resumes from different candidates. At the beginning of each resume is the candidate’s name; the rest of the pages (assume each resume is two pages long) make no mention of the name. In this case, chunks may lose information when each resume is split up using some chunk size. One easy way to solve this is to add additional info (e.g., the candidate’s name) to each chunk as a header or footer.

The second technique is chunk optimization. Based on your downstream task, you need to determine the optimal chunk length and how much overlap you want between chunks. If your chunks are too small, they may not include all the information the LLM needs to answer the user’s query; if they are too big, they may contain too much irrelevant information, which reduces vector search accuracy, confuses the LLM, and may sometimes not even fit into the context window.

From my own experience, you don’t have to stick to one chunk optimization method for all the steps in your pipeline. For example, if your pipeline involves both high-level tasks like summarization and low-level tasks like coding based on a function definition, you could use a bigger chunk size for summarization and smaller chunks for coding reference. When your query requires the LLM to search lots of documents and return a list of documents as the answer, it is better to use the similarity_search_with_score search type. If your query requires the LLM to perform a multi-step search to come to an answer, you can add the prompt "Think step by step". This helps the engine break the query down into multiple sub-queries.

After you retrieve the relevant chunks from your database, there are still more techniques to improve generation quality. You can use one or more of the following techniques based on the nature of your task and the format of your text chunks.
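The header-per-chunk fix from the resume example above can be sketched in a few lines. The chunk size, overlap, and header format here are arbitrary choices for illustration:

```python
# Sketch: prepend identifying info (e.g. candidate name) to every chunk so
# that chunks from later pages still carry it. Sizes are illustrative.

def chunk_with_header(text: str, header: str, size: int = 200, overlap: int = 20):
    """Split text into overlapping chunks, each prefixed with a header line."""
    chunks, start = [], 0
    while start < len(text):
        piece = text[start:start + size]
        chunks.append(f"[Candidate: {header}]\n{piece}")
        start += size - overlap
    return chunks
```

The same idea works for any document-level fact (title, date, section name) that would otherwise be lost once the document is split.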
If your task is more relevant to one specific chunk, one commonly used technique is reranking or scoring. As mentioned earlier, a high score in vector similarity search does not always mean the highest relevance. You should do a second round of reranking or scoring to pick out the text chunks that are actually useful for generating the answer. For this, you can ask the LLM to rank the relevance of the documents, or you can use other methods like keyword frequency or metadata matching to refine the selection before passing the documents to the LLM to generate the final answer.

Balancing quality and latency

There are also some other tips I have found useful for improving and balancing generation quality and latency. In production, your users may not have time to wait for a multi-step RAG process to finish, especially when there is a chain of LLM calls. The following choices may help you improve the latency of your RAG pipeline. The first is to use a smaller, faster model for some steps. You don’t necessarily need the most powerful model (which is often the slowest) for every step in the RAG process. For example, for simple query rewriting, generating hypothetical documents, or summarizing a text chunk, you can probably use a faster model (like a 7B or 13B local model). Some of these models may even be capable of generating a high-quality final output for the user.
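As a concrete example of the cheap, non-LLM reranking option mentioned above, here is a minimal keyword-frequency rerank. Real pipelines might use a cross-encoder or an LLM call instead; this sketch only illustrates the second-pass scoring idea:

```python
# Sketch: rerank retrieved chunks by raw keyword frequency of the query terms.
# A deliberately simple scorer; cross-encoders or LLM ranking are alternatives.

def rerank(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks with the highest query-term frequency."""
    terms = query.lower().split()

    def score(chunk: str) -> int:
        text = chunk.lower()
        return sum(text.count(t) for t in terms)

    return sorted(chunks, key=score, reverse=True)[:top_k]
```

This would run after the vector search, trimming the candidate set before the final LLM generation call.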


Enhancing Power BI Workspace Management and Data Integration

In today’s data-centric environment, efficient management of Power BI workspaces and seamless integration with diverse data sources are indispensable for organizations seeking to derive actionable insights. Leveraging automation through PowerShell scripts, augmented with core API calls, offers a robust way to streamline these processes, reducing manual overhead and enhancing productivity. Let’s delve into how these automation techniques facilitate workspace management and data integration for Power BI users.

Pre-requisites:

Power BI Management module: Ensure the Power BI Management module for Windows PowerShell is installed. If not, execute the following command in PowerShell to install it:

Install-Module -Name MicrosoftPowerBIMgmt -Scope CurrentUser

Login credentials: Power BI account credentials are required for authentication. Use the Login-PowerBI cmdlet to authenticate before executing the scripts.

1. Workspace Replication

Managing numerous Power BI workspaces and ensuring uniform access across them can be arduous. Automation simplifies this task by replicating workspace configurations, encompassing dashboards, reports, and datasets, across multiple target workspaces. Cmdlets like Get-PowerBIWorkspace, New-PowerBIWorkspace, Export-PowerBIReport, New-PowerBIReport, and Invoke-PowerBIRestMethod enable replication of content such as semantic models, Power BI reports, and RDL (paginated) reports between workspaces. By automating this process, organizations can swiftly provision new workspaces, foster collaboration, and maintain consistent access permissions across teams.

2. Updating RDL Data Sources

Binding RDL data sources to the correct datasets is critical for maintaining data integrity and accuracy in reports. Automation scripts handle this by programmatically updating RDL data sources to align them with the appropriate datasets.
Leveraging calls like Invoke-WebRequest for RDL data source management, organizations can ensure seamless synchronization between reports and datasets, enhancing reporting accuracy and reliability.

3. Dataset Parameter Customization

Adapting dataset parameters to evolving data requirements is pivotal for keeping data relevant. Automation expedites this by bulk-updating parameters across multiple datasets. Cmdlets like Get-PowerBIDataset and Invoke-PowerBIRestMethod enable efficient parameter customization, ensuring data consistency and accuracy. By automating parameter updates, organizations can streamline data modeling processes and respond promptly to changing business needs.

4. Data Source Credential Management

Securing access to external data sources is paramount for preserving data integrity. Automation scripts centralize data source credential management across datasets, ensuring robust authentication. Using cmdlets such as Get-PowerBIDatasource and Invoke-PowerBIRestMethod, these scripts facilitate seamless updates of data source credentials, bolstering data security and access control. With automation, organizations can mitigate the risks associated with manual credential management and ensure uninterrupted data integration.

5. Error Handling and Feedback

Effective error handling is essential for detecting and resolving issues during automation. The scripts incorporate try-catch blocks and Write-Host commands to provide informative feedback to users, enhancing transparency and visibility into automation workflows. By terminating execution on the first error, they minimize downtime and speed up troubleshooting.
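To illustrate the parameter-update call the scripts make via Invoke-PowerBIRestMethod, here is a small Python sketch that builds the equivalent REST request. The endpoint shape follows the public Power BI REST API (Datasets - Update Parameters), but verify against current Microsoft documentation before relying on it; the dataset id and parameter names below are placeholders.

```python
# Sketch: build the Power BI REST call for bulk dataset parameter updates.
# Endpoint per the public "Datasets - Update Parameters" API; verify in docs.
import json

def update_parameters_request(dataset_id: str, params: dict) -> tuple[str, str]:
    """Return (url, json_body) for a POST with an Authorization: Bearer header."""
    url = (f"https://api.powerbi.com/v1.0/myorg/datasets/"
           f"{dataset_id}/Default.UpdateParameters")
    body = json.dumps({
        "updateDetails": [{"name": k, "newValue": v} for k, v in params.items()]
    })
    return url, body
```

In PowerShell the same body would be passed to Invoke-PowerBIRestMethod -Method Post, which handles authentication from the Login-PowerBI session.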
Conclusion

Automation serves as a cornerstone in optimizing Power BI workspace management and data integration processes, empowering organizations to maximize efficiency and productivity. By seamlessly integrating PowerShell scripts with core API calls, organizations can accelerate decision-making, fortify data security, and adapt swiftly to evolving business requirements. Embracing automation empowers organizations to unleash the full potential of Power BI, unlocking actionable insights from their data effortlessly.


IFS Data Migration Using the Excel Migration Web Add-In

About IFS Cloud

IFS Cloud is a composable enterprise software application that choreographs people, customers, and assets to minimize cost and risk as you plan, manage, and optimize critical asset availability and strive for operational excellence and workforce productivity. IFS provides a broad range of ERP, Service Management, and Enterprise Asset Management capabilities, and organizations choose the capabilities they need from across the solution areas. Built on a powerful, flexible common platform, IFS Cloud combines deep industry and functional strength with intelligent autonomous capabilities. First, let’s understand the fundamentals of IFS Data Migration.

IFS Data Migration

IFS Data Migration is a tool that loads data from external sources into the core application objects in IFS Cloud. It is typically used to transfer data from legacy systems when an IFS solution is first implemented, or when systems are consolidated. The prime benefit of IFS Data Migration is that it always loads data through the business logic APIs in IFS Cloud. Unlike tools that insert data straight into tables, IFS Data Migration ensures that business rules, validations, and integrity checks are never bypassed. Migration jobs are highly adaptable, and multiple jobs can be chained together depending on your requirements.

Data migration is the process of transferring data from one location, format, or application to another. It is a tough and complex job due to data gravity. The IFS Data Migration tool helps you migrate your data while overcoming that complexity: it lets you move data via the application’s business logic, guaranteeing that the transferred data is valid at the destination. The tool supports several types of data migration, such as File Migration and Source Migration.
To move data using the IFS Data Migration tool, the first step is to create a Migration Job.

Migration Jobs

To begin migrating data with the IFS Data Migration tool, you must first create a Migration Job. A migration job is mandatory for all migration types because it holds configuration details such as Source, Destination, Type, Metadata, and Rules. Go to Solution Manager > Data Management > Data Migration > Migration Job. The form has a header with details common to all migration types, plus child tabs that are enabled according to the selected Procedure (migration type).

Field descriptions:
- Job Id: Unique name to identify the migration job. Only characters, numbers, and "_" are allowed.
- Description: Description of the job.
- Procedure: Defines the behaviour/type of the migration job (File Migration, Source Migration, etc.). The detail tabs change according to the selected procedure.
- Direction: Read-only field; its value changes according to the selected procedure.
- View Name: Enter an IFS view name if you want the source migration job to automatically create a New__ method in the method list. This populates the File Mapping, Source Mapping, and Method List tabs accordingly.
- Group ID: Migration jobs can be assigned to a pre-defined group for easy management.
- Note Text: Always use the note text column to describe the purpose of the job; it is fetched when printing the documentation of a job.
- Last Info: Displays the log of the last execution.

Migration Types

Migration jobs can be divided by the way they migrate data. Each type may have a different procedure for migrating data.

File Migration
This type of migration job transfers data between the database and files. The data can be imported into a single database logical unit or into a data migration container table (IC table). You can export the data from multiple logical units to a single file.
Source Migration
This type of migration job can be used to migrate data from one data source to another. The source can be a TABLE, VIEW, JOIN, DB LINK, or UNION, while the destination can be a single logical unit or multiple logical units in the database.

Excel Migration
This type of migration job can be used to migrate data between Microsoft Excel (using the IFS Excel Add-In) and the database. Data can be both queried from and saved into single or multiple logical units.

Internal Replication
This type of migration job can be used to replicate data within an IFS Applications instance. Internal replication can be configured to replicate data between sites and companies in the same instance.

Export/Import Migration Definition
Use this type when you need to export/import migration jobs between IFS Applications instances.

Now let’s get to the main purpose of this blog: IFS data migration using the Excel Migration Web Add-In. Let’s start, step by step.

Excel Migration

A migration job with procedure type Excel Migration can be executed through MS Excel, and an authorized end user can load and manipulate data in the specific logical units related to the given migration job. The IFS Data Migration Excel Add-In uses the Excel Migration procedure type and is an extension to the standard data migration functionality in IFS Cloud.

Step 1: Creating an Excel Migration job

Go to Solution Manager > Data Management > Data Migration > Migration Job. In the Migration Job page, create a new job with the Procedure Name set to EXCEL_MIGRATION. You must specify a View Name.

Source Mapping

Here you add or modify the columns required for the migration job. Make sure Pos and Length are not set to 0. Enter a value for the Source Column if you want to display it as an input from the user in Excel. The Source Column names will be used when creating the IC table for migrating data. Note the following restrictions for the Source Column:

1. It should not exceed 30 characters.
2. It should not have any duplicates.
3. It should not contain spaces.
4. It should start with a letter.
5. All mandatory fields of the logical unit(s) should have a Source Column value, unless there


Power BI Dataflow: Streamline Data Transformation

Power BI is one of the most widely used visualization tools across industries today. Its capability to present data in fantastic charts and visuals sets it apart. Data can be imported from tons of source systems in no time, whether it lives in a database, in a file, or in SharePoint. We can also manipulate the data after it is loaded into a Power BI Desktop file, but what if we want to reuse a table manipulated in Power Query for a new report? In general, we would have to manipulate that data separately for each report, which is a tedious task to perform repeatedly.

Dataflow is the solution here. It is a collection of tables created in Power BI service workspaces. With Dataflow, we get data transformation functionality in the Power BI service: all the operations available in Power BI Desktop can be performed in the Power BI service, and we can run all Power Query operations in the cloud independently.

In general, we load our data into Power BI Desktop, build some visualizations, and then publish the report to the Power BI service. When we publish any report, we see the report and a dataset with the same name as the report. The dataset is the file associated with the Power BI report. A dataflow is totally independent of the dataset of any report, and its refresh is also performed independently in the cloud. In a Power BI dataset we store the data in-memory; Dataflow uses Azure Data Lake Storage.

Image reference: https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-introduction-self-service

When to use Dataflow:

1. To avoid maintenance cost

As discussed earlier, if we want the same table in multiple reports, we must get it from the source every time; and if we perform a data transformation operation on the table for one report and want the same operations on a table in another report, we have to do it again and again.
Isn’t it too head-scratching? Yes, it is. Dataflow provides a centralized location in our workspace. Suppose we create a dataflow with our desired tables and perform all the ETL operations in the Power BI service itself, and then call that dataflow in Power BI Desktop from our workspace. Then, every time a data update is required in a table that is used in multiple reports, we only need to change the dataflow, and the maintenance work becomes easy.

2. Independent table refresh

Dataflow refresh is independent of report refresh, and we can refresh specific tables with Dataflow: more transactional tables on an hourly basis (or whatever is needed) and other tables on a daily/weekly/monthly basis. This gives us freedom from report refresh.

How to create a Dataflow:

Step 1: Open the Power BI service.
Step 2: Go to the "Workspaces" pane on the left-hand side of the screen and click it.
Step 3: Click on "Create a workspace".
Step 4: Give a suitable name to the workspace and hit "Save". This creates a new workspace in the Power BI service.
Step 5: On the upper-left side of the workspace window, click the (+) New icon and select Dataflow from the list. A new window opens.
Step 6: There is an option named "Define new tables"; we can create a fresh dataflow with it. Click "Add new tables" under "Define new tables". We then get a list of data source options, just like Get Data in Power BI Desktop.
Step 7: Select a data source as needed. Say we select SQL Server database as the source. After clicking it, we get a new window with connection settings and connection credentials. If we have an on-premises SQL Server database, we need to install a data gateway from the Power BI service.
Step 8: After filling in the credentials, choose the desired tables from the database and click "Transform data". We now get a window similar to Power Query; every operation we do in this window is like an operation in Power BI Desktop, and we can perform all the ETL operations here. After all the operations, click "Save and close".
Step 9: Provide a suitable name for the dataflow and click "Save". This creates our dataflow.

Note: Once the dataflow is created, we need to refresh its data. If we forget to refresh the data and use the dataflow in some other dataflow, it will not show any data.

Link tables from other dataflows:

This option is useful in scenarios where we need to use the tables of an existing dataflow. We don’t need to add tables again and again; to reduce duplication, we choose the option "Link tables from other dataflows", which lets us use tables that exist in another dataflow alongside other desired tables. To work with this option, follow the steps below.

Click on the workspace, then click (+) New and choose Dataflow. From the options, click "Link tables from other dataflows". A window with connection credentials opens; after filling in all the fields, click "Sign in" and then "Next". A window showing our workspaces that contain dataflows appears. After selecting the desired tables from the dataflow, click "Transform data". To load other tables, click "Get data" and pick the desired data source, e.g., Excel workbook, SQL Server database, or Text/CSV. Let’s get the table from a SQL Server database. When we click on SQL Server database, we get a credential window. After filling in the credentials, we can access all the tables of


Send a result set of a Snowflake query as email in Tabular format using ADF

Sending an email notification is the traditional way of checking whether an integration has completed properly. In ADF, to send an email we need to configure a logic app in the Azure portal. We can send notifications on events such as Success, Failure, or Completion of a pipeline, and we can send the result set of a query as an attachment as well. But when we need to send a query’s result set in the email body, we don’t get the right format. Here I explain how we can show a result set with multiple records in tabular format in an email body, using some simple steps.

We start with a query whose result set needs to be sent by email. Here I am using Snowflake as the database, but the same approach works in SQL Server too. To get the result as a single record in an HTML table format, we modify the query to wrap each row in <tr> and <td> tags and concatenate everything into one record. Returning a single record lets the Lookup activity in ADF process it in a single go; otherwise we would need a ForEach activity to process each record.

With the logic app already configured in the Azure portal, create a pipeline in ADF, add a Lookup activity, and use the modified query in its settings. I have already configured a pipeline for success email notification using a Web activity. Add that pipeline to our pipeline, and in the PL_SendEmailSuccess settings, pass the Lookup output to the parameter that the logic app exposes as Body. Now execute the pipeline. Although the output of the Lookup looks messy, the email we receive is properly formatted as a table.

Conclusion: This type of task can be used to send status reports for jobs that run on a daily basis, to send alerts on significant metrics, or to share the results of ETL audits with rejected records. Instead of downloading attachments, users can do a quick check using this approach.
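To make the target format concrete, here is a small Python sketch that produces the same kind of HTML table body. In the approach above, the equivalent <tr>/<td> concatenation is done inside the Snowflake query itself so that the Lookup activity returns one record; the column names and rows below are illustrative.

```python
# Sketch: render query rows as the HTML table body sent to the logic app.
# Column names and data are illustrative placeholders.

def html_table(columns, rows):
    """Build a single HTML string: header row plus one <tr> per data row."""
    head = "".join(f"<th>{c}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{v}</td>" for v in row) + "</tr>" for row in rows
    )
    return f"<table border=1><tr>{head}</tr>{body}</table>"
```

Passing a string of this shape as the logic app's Body parameter is what makes the email render as a proper table instead of raw JSON.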
