How to Train CUI Classifiers Without Actual Documents?

Asked By TechSavvyGuru42 On

I'm configuring DLP, sensitivity labeling, and trainable classifiers at my workplace in a Microsoft GCC High environment. I'm struggling to train a "CUI" classifier effectively because we don't have enough actual CUI documents; as far as I can tell, I need at least 50 positive and 50 negative samples. I've tried generating fake data, but that hasn't worked. Have any sysadmins or Information Protection Engineers tackled this? What steps did you take to set up trainable classifiers without enough actual CUI documents?
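For anyone in the same spot, a quick way to see how far a seed set is from that minimum is to count the files in each sample folder before uploading anything. Below is a minimal Python sketch, not an official tool: the 50-per-class figure is just the number quoted in this question, and the function names, folder layout, and extension list are all hypothetical.

```python
from pathlib import Path

# Assumption: 50 per class is the minimum quoted in this thread, not a verified figure.
MIN_SAMPLES = 50
# Hypothetical extension list; adjust it to the file types you actually collect.
DOC_EXTENSIONS = {".docx", ".pdf", ".txt", ".pptx", ".xlsx"}


def count_samples(folder: Path) -> int:
    """Count candidate documents directly inside `folder` (non-recursive)."""
    return sum(
        1
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in DOC_EXTENSIONS
    )


def seed_gap(positive_dir: Path, negative_dir: Path, minimum: int = MIN_SAMPLES) -> dict[str, int]:
    """Return how many more documents each class needs to reach `minimum`."""
    return {
        "positive": max(0, minimum - count_samples(positive_dir)),
        "negative": max(0, minimum - count_samples(negative_dir)),
    }
```

Calling `seed_gap(Path("cui_positive"), Path("cui_negative"))` on two local folders (names made up here) returns the shortfall for each class, which at least tells you how many documents to request.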

3 Answers

Answered By CuriousMind22 On

I don’t have a solid answer, but I'm definitely interested in hearing how this works for you! Just out of curiosity, why are you trying to set up these classifiers without the actual CUI? Is it because your company hasn't provided them, or just doesn't have enough? A few insights would really help!

CUIresearcher88 -

We’re aiming for CMMC Level 2 certification. I’m using DLP to identify where our CUI documents are located. We do have some actual CUI documents, but not the 50 required for training.

QuestionNinja07 -

Yeah, it's tough! For compliance, you might be in a bind without enough documents.

Answered By SupportiveTechie On

You might have some luck asking a contracting officer for sample documents if you're working on any active CUI contracts. They usually have samples that can be used for training. Just a heads-up, though: we've had mixed results with the classifiers for CUI in the past.

TechSavvyGuru42 -

Good point! I’ve read that the classifiers can be hit or miss. I think I’ll try reaching out to the CO for samples.

Answered By DataGuardianPro On

Training classifiers for CUI is tricky because the definition is so broad: NARA's CUI Registry lists around 125 categories. Start by limiting SharePoint access to the people who need to work with CUI, set a default label there, and prevent users from changing it. I recommend running auto-labeling policies in simulation mode first so you can see what they would match in your data before anything is enforced. A helpful resource is Summit7’s steps on Microsoft 365 for CMMC. I’m dealing with a similar issue for a client, and it’s definitely a challenge!

TechSavvyGuru42 -

I put the DLP policy in test mode to see where things stand. I have a restricted SharePoint setup for CUI, but to be honest, I’m concerned about my job since we're bringing in an MSP for a lot of this Microsoft work.
