Audit Firms
SUCCESS – 03
How We Catalyzed a Forensic Team to Identify High-Risk Transactions
As part of a project, we analyzed large volumes of email data and documents to identify high-risk transactions that met specific criteria. To achieve this, we used text analytics, including sentiment analysis and FinBERT analysis. We also implemented a proprietary machine learning model to flag unusual emails, and we extracted and analyzed bank statements to trace the flow of money between multiple accounts and identify high-risk transactions.
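As a rough illustration of the bank-statement step, the sketch below sums transfers per directed account pair and flags pairs where substantial amounts moved in both directions. The record layout, the account names, the threshold, and the round-trip rule itself are assumptions made for illustration; the criteria actually used to define a high-risk transaction are not stated here.

```python
from collections import defaultdict

# Hypothetical transfer records: (source account, destination account, amount).
transfers = [
    ("ACC-A", "ACC-B", 50_000.0),
    ("ACC-B", "ACC-A", 49_500.0),
    ("ACC-A", "ACC-C", 1_200.0),
    ("ACC-C", "ACC-D", 300.0),
    ("ACC-A", "ACC-B", 10_000.0),
]


def aggregate_flows(records):
    """Sum the amounts transferred along each directed account pair."""
    totals = defaultdict(float)
    for source, destination, amount in records:
        totals[(source, destination)] += amount
    return dict(totals)


def round_trips(flows, min_amount):
    """Return account pairs where at least min_amount moved in both directions."""
    flagged = set()
    for (source, destination), amount in flows.items():
        reverse = flows.get((destination, source), 0.0)
        if amount >= min_amount and reverse >= min_amount:
            flagged.add(tuple(sorted((source, destination))))
    return sorted(flagged)


flows = aggregate_flows(transfers)
suspicious_pairs = round_trips(flows, min_amount=10_000.0)
```

Here only the ACC-A/ACC-B pair is flagged, because money above the threshold moved both ways between those two accounts. A real review would likely add further rules, such as longer chains or timing windows.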
Tools Used

Python
Data Extraction, Analysis, and ML Model

Microsoft Azure-based Cognitive Services
Advanced Text Analytics

Power BI/Tableau
Visualization

Alteryx Designer
Extract-Transform-Load (ETL)
RESULTS
- Reduction in time spent reviewing emails
- Improvement in productivity
TRANSFORMATION PROCESS
- Data Collection
- Risk Matrix Development
- Training Dataset Creation
- Manual Evaluation & Model Retraining
- Data Cleaning & Transformation
- Results Presentation
- ML Model Impact
The forensic team provided us with over 19 TB of data, including 1,000+ email PST/OST files and documents in PDF, Word, Excel, plain-text, and other formats.
We first developed a conceptual risk matrix comprising sentiment analysis, keyword identification, and K-means clustering. Each output was assigned a rank, and the ranks allowed us to surface the highest-risk emails and text documents.
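One possible reading of the rank-based matrix is sketched below: each email is ranked separately on each signal, and the ranks are summed so that the lowest total marks the highest-risk item. The signal names, the sample values, the tie handling, and the equal weighting of the three signals are all assumptions for illustration; how the ranks were actually combined is not described.

```python
# Hypothetical per-email signals; higher means riskier for every signal.
# sentiment_negativity: strength of negative sentiment (0 to 1)
# keyword_hits: number of watch-list keywords found
# cluster_distance: distance from the nearest K-means centroid (outlier measure)
emails = {
    "email-001": {"sentiment_negativity": 0.10, "keyword_hits": 0, "cluster_distance": 0.4},
    "email-002": {"sentiment_negativity": 0.85, "keyword_hits": 3, "cluster_distance": 2.1},
    "email-003": {"sentiment_negativity": 0.40, "keyword_hits": 1, "cluster_distance": 0.9},
    "email-004": {"sentiment_negativity": 0.70, "keyword_hits": 0, "cluster_distance": 1.5},
}

SIGNALS = ("sentiment_negativity", "keyword_hits", "cluster_distance")


def rank_by_signal(records, signal):
    """Rank emails on one signal: 1 is the riskiest; ties share the better rank."""
    ordered = sorted(records, key=lambda key: records[key][signal], reverse=True)
    ranks = {}
    for position, email_id in enumerate(ordered, start=1):
        previous = ordered[position - 2] if position > 1 else None
        if previous is not None and records[previous][signal] == records[email_id][signal]:
            ranks[email_id] = ranks[previous]
        else:
            ranks[email_id] = position
    return ranks


def combined_ranking(records, signals=SIGNALS):
    """Sum the per-signal ranks; the lowest total is the highest-risk email."""
    totals = {email_id: 0 for email_id in records}
    for signal in signals:
        for email_id, rank in rank_by_signal(records, signal).items():
            totals[email_id] += rank
    return sorted(totals.items(), key=lambda item: item[1])


review_queue = combined_ranking(emails)
```

The resulting `review_queue` lists emails from most to least risky, so reviewers can work from the top. A weighted sum would be a natural variation if some signals proved more reliable than others.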
To build the training dataset, we selected a sample set of emails, which we evaluated jointly with the forensic team and labeled with a risk rating. This labeled sample was used to train a machine learning model to identify risky emails in the remaining email data. We used a Multinomial Naive Bayes classifier, and its output was also factored into the risk score.
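To make the classification step concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The sample emails, the two labels, and the tokenizer are hypothetical, and the implementation used in the project is not specified; in practice a library class such as scikit-learn's MultinomialNB would typically replace the hand-written class.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class MultinomialNaiveBayes:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each class."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)
            for token in tokenize(text):
                if token in self.vocabulary:  # ignore words never seen in training
                    score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        """Return the label with the highest log posterior."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical labeled sample standing in for the jointly rated emails.
train_texts = [
    "please delete the invoice before the audit",
    "transfer the funds offshore and keep this quiet",
    "backdate the contract and do not tell finance",
    "lunch meeting moved to noon on friday",
    "attached is the agenda for the team meeting",
    "reminder to submit your timesheet this week",
]
train_labels = ["high", "high", "high", "low", "low", "low"]

model = MultinomialNaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("keep the offshore transfer quiet before the audit")
```

The per-class log scores from `log_scores` could also serve as the classifier output that feeds into an overall risk score, rather than using only the predicted label.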
The emails and texts identified as high risk were handed to the forensic team for manual evaluation. Based on their feedback, the model was retrained with the additional labeled data.
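The feedback loop can be pictured roughly as follows: reviewer verdicts on flagged emails are folded back into the training set, and the share of overturned flags gives a simple signal of how the model is doing. The record fields, the email IDs, and the metric are assumptions for illustration, and the retraining call itself is omitted.

```python
# Hypothetical training set keyed by email ID: (text, risk label).
training_set = {
    "email-001": ("lunch meeting moved to noon", "low"),
    "email-002": ("delete the invoice before the audit", "high"),
}

# Hypothetical reviewer feedback on model-flagged emails: the reviewed label
# may confirm or overturn what the model predicted.
reviewer_feedback = [
    {"email_id": "email-017", "text": "wire the payment to the new account today",
     "predicted": "high", "reviewed": "high"},
    {"email_id": "email-023", "text": "urgent: cake in the kitchen",
     "predicted": "high", "reviewed": "low"},
    {"email_id": "email-002", "text": "delete the invoice before the audit",
     "predicted": "high", "reviewed": "high"},
]


def merge_feedback(training, feedback):
    """Return a new training set where reviewed labels are added or override old ones."""
    merged = dict(training)
    for item in feedback:
        merged[item["email_id"]] = (item["text"], item["reviewed"])
    return merged


def false_positive_rate(feedback):
    """Share of emails flagged as high risk that reviewers downgraded."""
    flagged = [item for item in feedback if item["predicted"] == "high"]
    if not flagged:
        return 0.0
    overturned = sum(1 for item in flagged if item["reviewed"] != "high")
    return overturned / len(flagged)


updated_training_set = merge_feedback(training_set, reviewer_feedback)
overturned_share = false_positive_rate(reviewer_feedback)
```

The classifier would then be refit on `updated_training_set`, and the cycle repeated until the overturned share settles at an acceptable level.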
In addition, we used Microsoft Cognitive Services together with Alteryx to clean and transform the data.
The final results were presented in Tableau, and the operational data was presented in Power BI.
After the ML model was deployed, the forensic team was able to identify more critical transactions.