Start having visibility into service accounts with Defender for Identity

Defender for Identity is a cloud-based security solution that leverages on-premises Active Directory signals to identify and detect threats. It monitors Domain Controllers by capturing their network traffic and combining it with Windows event logs to analyse data for attacks that might occur on a network.

Once the Defender for Identity sensor has been installed on all the Domain Controllers, we can use the telemetry it provides to query for information.

In order to query the data generated by Microsoft Defender for Identity (MDI), we need an E5 license, because this includes Defender for Identity.

This blog post will cover the following questions:

  • ‘Do we still use this (domain) service account?’
  • ‘Where do my service accounts log on to?’
  • ‘How can we baseline Kerberos TGS requests?’

We are not going to talk about fancy attacks targeting AD. Instead, we are going to clean up service accounts in AD and start gaining visibility into them.

What is a service account and why should we care?

A service account is a ‘non-human’ account that has been created to run a particular piece of software or service and to interact with the operating system.

Service accounts need specific rights to function properly, but this often leads to them having admin rights on a system, vendors requiring Domain Admin privileges, and so on.

There are great security solutions such as Group Managed Service Accounts (gMSA), but in order to use them, there is a set of requirements the software or service needs to meet. Not every piece of software or service supports gMSA, so organizations fall back to using regular domain accounts as service accounts.

It is a nightmare, because in most cases these service accounts never rotate their passwords and often have weak passwords as well. Another common issue is that many organizations don’t have visibility into their service accounts and don’t know whether they are still being used. This increases the attack surface of an organization, so what if we could reduce it by removing unused service accounts?

Start having visibility into service accounts

Advanced hunting is a query-based threat-hunting tool that lets you explore up to 30 days of raw data. This will help us query the logon events of each individual (service) account.

To be able to use Advanced Hunting, we need the license mentioned above. We are going to use the ‘IdentityLogonEvents‘ table, because it contains information about authentication activities.

  • Use case: Clean up unused service accounts

We have a couple of service accounts in Domain Admins, but we don’t know whether all of them are still being used. The great thing about Advanced Hunting is that we can use KQL to query for authentication activities related to the specified accounts.

In the following query, we are going to look for the authentication activities of a few service accounts; let’s say the service accounts in Domain Admins. The query looks at the past 30 days and counts the authentication requests made by the list of specified accounts. When a specified service account does not show up in the results, it may be worth taking a look at it, because it might not be in use anymore. Service accounts shouldn’t be Domain Admins in the first place, but if you do have a couple of them, I highly recommend starting by cleaning those up first.

Query:

// Look back over the past 30 days
let timeframe = 30d;
// The service accounts to investigate -- replace with your own UPNs
let srvc_list = dynamic(["svc_account1@contoso.com","svc_account2@contoso.com","svc_account3@contoso.com","svc_account4@contoso.com","svc_account5@contoso.com","svc_account6@contoso.com"]);
IdentityLogonEvents
| where Timestamp >= ago(timeframe)
// Case-insensitive match against the list of service accounts
| where AccountUpn in~ (srvc_list)
| summarize Count = count() by AccountName, DeviceName, Protocol

Results:

In the following results, we can see the count of all authentication requests made by each specified service account in the past 30 days. What you may notice is that a few of the service accounts specified in the query don’t show up in the results. Those accounts are worth investigating further, because there is a high chance they are no longer in use.

  • Count authentication requests per day

We are now going to write a simple KQL query to count all the authentication requests made per day by each associated service account. This will show us the logon pattern of each service account.

Query:

// Look back over the past 30 days
let timeframe = 30d;
// The service accounts to investigate -- replace with your own UPNs
let srvc_list = dynamic(["svc_account1@contoso.com","svc_account2@contoso.com","svc_account3@contoso.com","svc_account4@contoso.com","svc_account5@contoso.com","svc_account6@contoso.com"]);
IdentityLogonEvents
| where Timestamp >= ago(timeframe)
| where AccountUpn in~ (srvc_list)
// Count requests per 24-hour bin for each account and device
| summarize Count = count() by bin(Timestamp, 24h), AccountName, DeviceName
| sort by Timestamp desc

Results:

The results now show the authentication request count per day for each associated service account. We can also see where each service account has logged on to.

Now we are going to visualize the results. I’m using Jupyter Notebooks to do this, but it can also be done with the ‘render’ operator in Kusto.

Here we can see, for example, which service account made the most logons in the past 30 days. All the service accounts are marked in red, and everything is displayed in a visualized way.
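To give an idea of what such a notebook cell could look like, here is a minimal sketch in Python using pandas and matplotlib. The sample rows, account names, and output file name are hypothetical placeholders; in practice you would load the exported results of the query above (with its Timestamp, AccountName, DeviceName, and Count columns) instead.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs outside a notebook
import matplotlib.pyplot as plt

# Hypothetical sample rows shaped like the query output -- replace with your export
rows = [
    ("2021-07-01", "svc_account1", "DC01", 120),
    ("2021-07-01", "svc_account2", "SQL01", 45),
    ("2021-07-02", "svc_account1", "DC01", 98),
    ("2021-07-02", "svc_account2", "SQL01", 51),
]
df = pd.DataFrame(rows, columns=["Timestamp", "AccountName", "DeviceName", "Count"])
df["Timestamp"] = pd.to_datetime(df["Timestamp"])

# Pivot to one column per account and one row per day
daily = df.pivot_table(index="Timestamp", columns="AccountName",
                       values="Count", aggfunc="sum")

# Bar chart of authentication requests per day, grouped by account
daily.plot(kind="bar", figsize=(10, 4), title="Authentication requests per day")
plt.tight_layout()
plt.savefig("service_account_logons.png")

# Which account made the most logons overall?
print(daily.sum().idxmax())
```

The same result can be approximated directly in Advanced Hunting by appending `| render timechart` to the query, if you prefer to stay inside the portal.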

  • Visualize where service accounts log on to

In this example, we are going to visualize where our service accounts log on to. I’m going to filter on prefixes such as ‘svc_sql’ or ‘srvcSQL’, which match the naming convention of all my SQL service accounts. In this query, I’m only looking for accounts that have made more than 50 authentication requests.

Query:

// Look back over the past 30 days
let timeframe = 30d;
IdentityLogonEvents
| where Timestamp >= ago(timeframe)
// Match the naming convention of the SQL service accounts
| where AccountUpn startswith 'srvcSQL'
    or AccountUpn startswith 'svc_sql'
| summarize Count = count() by AccountName, DeviceName
// Only keep accounts with more than 50 authentication requests
| where Count > 50

Results:

The results now only display service accounts that made more than 50 authentication requests. In this case, I’ve used Jupyter Notebooks to visualize the data as a pie chart. In this example, there were 20 unique service accounts with the prefix ‘svc_sql’ that made over 50 authentication requests in the past 30 days, including the machines they have logged on to.
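A pie chart like the one described above can be produced with a few lines of pandas. This is a minimal sketch with made-up account names and counts; in practice the series would come from the exported query results rather than the hard-coded values below.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs outside a notebook
import matplotlib.pyplot as plt

# Hypothetical per-account totals shaped like the query output (AccountName -> Count)
counts = pd.Series({"svc_sql01": 420, "svc_sql02": 130, "svc_sql03": 75})

# Pie chart showing each account's share of the total authentication requests
counts.plot(kind="pie", autopct="%1.1f%%", figsize=(5, 5),
            title="Authentication requests per SQL service account")
plt.ylabel("")  # drop the default series label on the y-axis
plt.savefig("sql_service_accounts_pie.png")
```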

Recommendations

A few important recommendations to turn this blog post into a practical one. In this section, we cover a few best practices for cleaning up inactive service accounts in Active Directory.

The first part of the query contains the following code:

let timeframe = 30d;
let srvc_list = dynamic(["svc_account1@contoso.com","svc_account2@contoso.com","svc_account3@contoso.com","svc_account4@contoso.com","svc_account5@contoso.com","svc_account6@contoso.com"]);

Replace ‘svc_account@contoso.com’ with the UPNs of your service accounts. Start with a small set of service accounts whose activity you want to investigate. I recommend starting with the service accounts in Domain Admins. Once you have run the query, you might notice that not all of the specified accounts show up in the results. Investigate those and contact the product owner of each account.

Looking for anomalies in Kerberos TGS requests

Defender for Identity tracks every Kerberos TGS request, which is equivalent to event ID ‘4769’ on a Domain Controller. In this section, we are going to look for anomalies in Kerberos TGS requests and baseline what’s normal.

Step 1.

The first part of the query prepares the time-series data for each identity using the ‘make-series’ operator.

Query:

// Look back 14 days, excluding today
let starttime = 14d;
let endtime = 1d;
// Bin size for the time series
let timeframe = 1h;
// Minimum events per hour before we care (used in later steps)
let TotalEventsThreshold = 3;
let TimeSeriesData = 
IdentityLogonEvents
| where Timestamp between (startofday(ago(starttime)) .. startofday(ago(endtime)))
// Build an hourly count series per account
| make-series PerHourCount=count() on Timestamp from startofday(ago(starttime)) to startofday(ago(endtime)) step timeframe by AccountName;
TimeSeriesData

Results:

In the results, there will be columns such as ‘PerHourCount’ and ‘Timestamp’, counting all the service tickets an identity has requested, together with the associated timestamps.

Step 2.

The second step is to use the series_decompose_anomalies() function to detect trends in our dataset, so we can flag anomalies based on the provided arguments, which are documented here:

Source: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/series-decompose-anomaliesfunction

Query:

let starttime = 14d;
let endtime = 1d;
let timeframe = 1h;
let TotalEventsThreshold = 3;
let TimeSeriesData = 
IdentityLogonEvents
| where Timestamp between (startofday(ago(starttime)) .. startofday(ago(endtime)))
| make-series PerHourCount=count() on Timestamp from startofday(ago(starttime)) to startofday(ago(endtime)) step timeframe by AccountName;
let TimeSeriesAlerts = TimeSeriesData
// Flag positive anomalies with a 1.5 threshold and a linear-fit baseline
| extend (anomalies, score, baseline) = series_decompose_anomalies(PerHourCount, 1.5, -1, 'linefit')
| mv-expand PerHourCount to typeof(double), Timestamp to typeof(datetime), anomalies to typeof(double), score to typeof(double), baseline to typeof(long)
| where anomalies > 0
| project AccountName, Timestamp, PerHourCount, baseline, anomalies, score
// Ignore hours with too few events to matter
| where PerHourCount > TotalEventsThreshold;
TimeSeriesAlerts

Results:

In the results, there will be a couple of columns. The ‘PerHourCount’ column counts the total TGS requests an identity made in a specific hour, while the ‘baseline’ column shows how many ticket requests were expected.

Step 3.

The last part of our query adds additional fields for more context. The important thing is to focus on the number of TGS requests a user made in a specific hour, but also on how many TGS were requested for user accounts instead of machine accounts in AD.

Query:

let starttime = 14d;
let endtime = 1d;
let timeframe = 1h;
let TotalEventsThreshold = 3;
let TimeSeriesData = 
IdentityLogonEvents
| where Timestamp between (startofday(ago(starttime)) .. startofday(ago(endtime)))
| make-series PerHourCount=count() on Timestamp from startofday(ago(starttime)) to startofday(ago(endtime)) step timeframe by AccountName;
let TimeSeriesAlerts = TimeSeriesData
| extend (anomalies, score, baseline) = series_decompose_anomalies(PerHourCount, 1.5, -1, 'linefit')
| mv-expand PerHourCount to typeof(double), Timestamp to typeof(datetime), anomalies to typeof(double), score to typeof(double), baseline to typeof(long)
| where anomalies > 0
| project AccountName, Timestamp, PerHourCount, baseline, anomalies, score
| where PerHourCount > TotalEventsThreshold;
TimeSeriesAlerts
| join (
IdentityLogonEvents
// Only keep TGS requests that target user accounts
| where AdditionalFields has 'TARGET_OBJECT.USER'
| extend ParsedFields = parse_json(AdditionalFields)
| extend Spns = ParsedFields.Spns
| extend TargetAccountDisplayName = ParsedFields.TargetAccountDisplayName
| summarize UserSpnCount=count(), Spns=make_set(Spns), TargetAccountDisplayName=make_set(TargetAccountDisplayName) by AccountName, bin(Timestamp, 1h)
) on AccountName, Timestamp
| extend AnomalyTimeattheHour = Timestamp
| where isnotempty(AccountName)
// Exclude tickets for the krbtgt service
| where Spns !has 'krbtgt/'
// Match the naming convention of your service accounts
| where TargetAccountDisplayName contains 'srvc'
| project AnomalyTimeattheHour, AccountName, TargetAccountDisplayName, PerHourCount, UserSpnCount, Spns, baseline, anomalies, score
| sort by AnomalyTimeattheHour desc

The following part of the query should be changed to match the naming convention of your service accounts:

  • Example:

Instead of ‘srvc’ it could be ‘svc’, for example.

| where TargetAccountDisplayName contains 'srvc'

Results:

In the final part of our query, there are a few columns worth summarizing. ‘TargetAccountDisplayName’ is a column that displays the service account a TGS was requested for. The ‘PerHourCount’ column indicates the total number of TGS requested in a specific hour.

The ‘UserSpnCount’ column, however, counts all the Kerberos tickets that were requested for user accounts that have an SPN.

As we can see in the first row, an identity made 9 ticket requests in total in a specific hour, but 8 of them were requested for user accounts with an SPN, instead of machine accounts.

The ‘score’ column tells how far the observed value is from the calculated baseline value. When looking at spikes, it is worth checking the score: the higher the score, the more likely it is an anomaly.
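To build intuition for what the baseline and score columns represent, here is a deliberately simplified Python sketch. It is not the actual implementation of series_decompose_anomalies() (which also decomposes seasonality and trend); it only mirrors the core idea of scoring each hour by its scaled deviation from a baseline, using made-up hourly counts with one obvious spike.

```python
import statistics

# Hypothetical hourly TGS request counts for one account over a day --
# a quiet pattern of 3-5 requests per hour, with one spike at hour 19.
hourly = [3, 4, 3, 5, 4, 3, 4, 5, 3, 4, 4, 3,
          3, 4, 3, 5, 4, 3, 4, 27, 3, 4, 4, 3]

# Robust baseline: the median of the series
baseline = statistics.median(hourly)

# Median absolute deviation as a robust spread estimate (floor at 1 to avoid /0)
mad = statistics.median(abs(x - baseline) for x in hourly) or 1

# Score each hour by its scaled deviation from the baseline; 1.4826 rescales
# the MAD so it is comparable to a standard deviation on normal-ish data.
scores = [(x - baseline) / (1.4826 * mad) for x in hourly]

# Flag hours whose score exceeds the 1.5 threshold, like the query above
anomalies = [i for i, s in enumerate(scores) if s > 1.5]
print(anomalies)  # -> [19], only the spike hour is flagged
```

The quiet hours score well below the threshold, while the spike at hour 19 scores far above it, which is exactly the kind of separation the ‘score’ column gives you in the query results.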

Summary

In this blog post, we have covered a few simple use cases using MDI telemetry. With the available data, we can discover all the authentication requests of an identity, which helps us see whether a (service) account is still active and where it logs on. In the first part of the blog, a simple KQL query specified a couple of service accounts; once we ran the query, some of the specified service accounts showed up in the results.

However, there is always a chance a service account was specified in the query but no authentication activity was found. If this is the case, it is worth investigating the account and potentially considering removing it.

The last part used time-series analysis to baseline how many Kerberos tickets are being requested by a user in a typical hour.
