Here is a primer on how AI/ML algorithms can be used to monitor corporate correspondence for increasingly sophisticated (spear)phishing scams.

Urgent messages to “click here” or “log in to view a statement” often pressure potential victims into inadvertently installing malware or keying their private information into a fake page.

As phishing schemes multiply and grow more sophisticated, cybersecurity defenders are boosting vigilance with AI and ML algorithms to intercept attacks and manage cyber risks.

Zac Amos

Features Editor
zac@rehack.com

AI and ML algorithms can now learn the methods and structural layouts of human correspondence and alert users and IT teams to potential danger or abnormalities.

By using cybersecurity solutions powered by predictive machine learning, any organization can strengthen its defense against malicious phishing attacks, spam and malware.

The four aspects of AI detection

By examining four aspects of each email, AI systems can detect potential attacks and flag them for IT's attention. The four are structural anomalies; context and history; tone of voice; and organizational patterns.
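
One simple way to picture how the four aspects might feed a single decision is a weighted score. The sketch below is only an illustration: the aspect names come from the list above, but the weights, the 0-to-1 score scale and the alert threshold are all hypothetical, and a real product would learn them from data.

```python
# Hypothetical weights for the four aspects; a real system would learn these.
WEIGHTS = {
    "structure": 0.3,      # structural anomalies
    "context": 0.3,        # context and history
    "tone": 0.2,           # tone of voice
    "organization": 0.2,   # organizational patterns
}


def risk_score(signals, weights=WEIGHTS):
    """Weighted average of per-aspect scores, each expected in [0, 1].

    Aspects missing from `signals` count as 0 (nothing suspicious found).
    """
    total = sum(weights.values())
    return sum(weights[name] * signals.get(name, 0.0) for name in weights) / total


def should_alert(signals, threshold=0.5):
    """Return True when the combined score reaches the (assumed) threshold."""
    return risk_score(signals) >= threshold
```

With these assumed weights, a message that looks bad on structure and context alone would cross the threshold, while an unusual tone on its own would not.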

Structural anomalies
AI detection engines look for suspicious behavior at the most basic level. There is more to a phishing scheme than urgent subject lines and unusual phrasing, so AI analyzes everything from the sender, recipient and subject line to the data and attachments, location and time.

AI can even identify email spoofing, wherein malicious parties forge a familiar identity. Workers tend to trust a message that appears to come from someone they know, so they are more likely to reply and divulge sensitive information. The machine learning component of an AI detection system, however, can identify these anomalies and report the danger before the employee has a chance to react.
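
As a rough illustration of one spoofing signal, the sketch below checks whether a sender's domain is a near-miss of a trusted one. It is a hand-written heuristic, not the learned model the article describes; the trusted-domain list and the 0.8 similarity threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from email.utils import parseaddr

# Hypothetical list of domains the organization actually uses.
TRUSTED_DOMAINS = {"example.com"}


def lookalike_domain(from_header, trusted=TRUSTED_DOMAINS, threshold=0.8):
    """Return the trusted domain that the sender's domain imitates, or None.

    A domain that is itself trusted is not a lookalike, so it returns None.
    """
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    if not domain or domain in trusted:
        return None
    for good in trusted:
        # ratio() is 1.0 for identical strings and falls toward 0 as they differ.
        if SequenceMatcher(None, domain, good).ratio() >= threshold:
            return good
    return None
```

For instance, a From header such as `Jane Doe <jane@examp1e.com>` would be reported as imitating `example.com`, while an unrelated domain would not be flagged by this check at all.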

Next, the headers of legitimate emails contain unique tracking data or information about the sender. Phishing messages often lack these headers, or carry headers that do not fit the service the message claims to come from. This structural misalignment would be a huge red flag for machine learning.
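
A header check along these lines can be sketched with Python's standard `email` package. The list of expected headers is an assumption made for the example, and the second function only reads whatever `Authentication-Results` header the receiving server recorded; it does not perform SPF, DKIM or DMARC validation itself.

```python
import re
from email import message_from_string

# Assumed baseline: headers that ordinary delivered mail normally carries.
EXPECTED_HEADERS = ("Message-ID", "Date", "Received")

# Results treated as failures in this sketch (an assumption, not a standard list).
FAILING = {"fail", "softfail"}


def missing_headers(raw_message, expected=EXPECTED_HEADERS):
    """List the expected headers that are absent from the raw message."""
    msg = message_from_string(raw_message)
    return [name for name in expected if msg.get(name) is None]


def failed_checks(raw_message):
    """Map each failing method (spf, dkim, dmarc) to its recorded result."""
    msg = message_from_string(raw_message)
    failures = {}
    for value in msg.get_all("Authentication-Results", []):
        for method, result in re.findall(r"\b(spf|dkim|dmarc)=(\w+)", str(value), re.I):
            if result.lower() in FAILING:
                failures[method.lower()] = result.lower()
    return failures
```

In a learned system these would be input features rather than hard rules, since some legitimate mail also arrives with sparse headers.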

Finally, every message carries IP addresses that tie it to a place and time. A message with a United States IP address looks suspicious when the user has usually logged in from Japan. Hackers can forge IP chains, but the forgery tends to leave a trace: the number of hops a message takes to reach the recipient can yield a clue. The usual email hops one or two times, but forged IPs would require triple that number.
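
The hop-count clue can be approximated by counting `Received` headers, since each mail server that handles a message normally adds one. The sketch below takes the figures from the paragraph above (about two hops as typical, triple that as suspicious) as defaults; both are assumptions that would need tuning for a real mail flow.

```python
from email import message_from_string


def hop_count(raw_message):
    """Count Received headers, one per mail server that handled the message."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("Received", []))


def unusual_route(raw_message, typical_hops=2, factor=3):
    """Flag a message whose hop count reaches `factor` times the typical value."""
    return hop_count(raw_message) >= typical_hops * factor
```

A long route is only a hint, because mailing lists and security gateways also add hops to legitimate mail.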

Context and history
Beyond the email's structure, language and context are vital to the AI system’s decisions. Has someone from the sales department ever directly messaged the accounting intern before? If not, this may be a yellow flag for the algorithm. It also flags phrasings and requests that are common in phishing schemes, such as:
- Requests for sensitive information like card numbers, passwords and company information
- A disconnect between the subject line and the content. These messages may be auto-generated and so come across as strange or confusing
- Misspellings, odd phrasings and grammatical errors — possible signs of auto-generated content
- Call for urgency — hackers use this tactic to scare the user into downloading or entering information without thinking clearly
AI recognizes these common signs and works to differentiate between regular communications and schemes.
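
Several of the signs listed above can be mimicked with plain pattern matching, which helps show what a learned model is looking for. The sketch below is a rule-based stand-in, not machine learning: the phrase lists are small hypothetical samples, the subject/body comparison is a crude word-overlap test, and spelling errors are not checked at all.

```python
import re

# Small, hypothetical phrase lists; a trained model would learn far more.
SENSITIVE = re.compile(r"\b(password|card number|credit card|login)\b", re.I)
URGENT = re.compile(r"\b(urgent|immediately|right away|act now)\b", re.I)


def content_flags(subject, body):
    """Return the names of the warning signs found in the subject and body."""
    flags = []
    text = f"{subject} {body}"
    if SENSITIVE.search(text):
        flags.append("sensitive_request")
    if URGENT.search(text):
        flags.append("urgency")
    # Crude disconnect test: no word of four or more letters is shared.
    subject_words = set(re.findall(r"[a-z]{4,}", subject.lower()))
    body_words = set(re.findall(r"[a-z]{4,}", body.lower()))
    if subject_words and not subject_words & body_words:
        flags.append("subject_body_mismatch")
    return flags
```

A subject of "Invoice attached" over a body demanding a password immediately would raise all three flags, while an ordinary lunch invitation would raise none.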

Tone of voice
Over time, machine learning picks up a user’s unique tone and style, building a clear picture of ‘normal’ behavior. Is there a particular phrasing or sign-off that a worker uses often? What about the use of emojis or punctuation? Some people’s messages are rife with smiley faces and exclamation points, so a serious or sparse email from them would be cause for concern.

Natural language processing trains AI to understand human communication down to the structural level. Every person has a unique way of texting or speaking, so AI focuses on these patterns and stylings to spot a message that falls outside typical behavior.
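
To make the idea of a per-user style baseline concrete, the sketch below measures three very simple features and reports how far a new message sits from the sender's history. Everything here is an assumption for illustration: the feature set, the use of ASCII emoticons such as `:)` as a stand-in for emojis, and the 0.05 floor on the spread. Real systems rely on much richer language models.

```python
import re
from statistics import mean, pstdev


def style_features(text):
    """Three toy style features, each normalized by the number of words."""
    words = re.findall(r"\w+", text)
    n = max(len(words), 1)
    return {
        "exclaim_per_word": text.count("!") / n,
        "emoticon_per_word": len(re.findall(r"[:;]-?[)(DP]", text)) / n,
        "avg_word_len": sum(map(len, words)) / n,
    }


def style_deviation(history, new_text, min_spread=0.05):
    """Largest absolute z-score of the new message across all features.

    `history` must be a non-empty list of the sender's earlier messages.
    """
    past = [style_features(t) for t in history]
    new = style_features(new_text)
    worst = 0.0
    for name, value in new.items():
        values = [features[name] for features in past]
        # Floor the spread so a very uniform history does not explode the score.
        spread = max(pstdev(values), min_spread)
        worst = max(worst, abs(value - mean(values)) / spread)
    return worst
```

For a sender whose past messages are full of exclamation points and smileys, a terse request for wire transfer details would score far higher than another cheerful note.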

Organizational patterns
Machine learning algorithms analyze behavior patterns from the entire organization to detect spear phishing. Instead of sending generic spam emails to thousands of people, spear-phishing attackers target only high-level individuals with access to critical data.

Detecting targeted and stylized attacks requires the algorithm to “learn” how people in the organization communicate in terms of frequency, hierarchy, user profile, functional roles and identity.

To reduce false positives, the algorithm has to account for how common it is for workers and outsiders to email each other for the first time. The more the AI understands the company’s infrastructure, the more it can recognize what is unusual.
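
Because first-time contact is common, it makes more sense to grade it than to treat it as an alarm. The sketch below keeps a simple count of who has emailed whom and returns one of three labels. The class name, method names and labels are hypothetical, and a real system would also weigh hierarchy, roles and message frequency.

```python
from collections import Counter


class ContactHistory:
    """Tracks how often each sender has written to each recipient."""

    def __init__(self):
        self.pair_counts = Counter()
        self.sender_counts = Counter()

    def record(self, sender, recipient):
        """Register one delivered message from sender to recipient."""
        self.pair_counts[(sender, recipient)] += 1
        self.sender_counts[sender] += 1

    def novelty(self, sender, recipient):
        """Grade how new this contact is, from least to most unusual."""
        if self.pair_counts[(sender, recipient)] > 0:
            return "known_pair"
        if self.sender_counts[sender] > 0:
            return "known_sender_new_pair"
        return "unknown_sender"
```

A "known_sender_new_pair" result, such as a sales colleague writing to an accounting intern for the first time, would be a mild signal to combine with others rather than grounds for blocking the message.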

Implementing phishing-detection AI

All organizations should include AI detection algorithms on every device, online or offline. From the CEO down to the office interns, anyone can fall victim to an attack, so cover every potential gap with smart detection software.

IT teams must also monitor the data and findings of the algorithms. What patterns are they recognizing, and what do they still need to learn? Is there a new development in phishing tactics that IT teams can teach them?

The computational power needed to host such AI is also growing rapidly, spanning everything from specialized machine learning hardware to the natural language processing used to discern unique voice patterns in communications. Organizations without the in-house resources may therefore need to outsource or turn to cloud solutions to handle the growing volume of daily attacks.

Finally, implementing AI detection software does not mean employees can remain unaware of phishing tactics. Organizations need to equip teams with phishing awareness and methods of response. For example, IT teams can show staff members examples of phishing emails caught by the AI. Seeing the direct threat gives them more immediate context.