How can I find common strings in multiple text or CSV files?

Asked By CuriousCat42 On

I'm trying to trace the source of some suspicious PDF editors that have been popping up lately. Everyone I ask claims they haven't done anything wrong, which is frustrating. To get to the bottom of this, I plan to gather web request logs from their devices for a thorough comparison of what everyone has in common.

While I know I can use PowerShell and its object comparison features, I feel like it might take me too long since I've only made a few scripts for work. There's a Python script option too, but there's a learning curve involved. I've seen some results for finding differences between files, but not much on identifying matching lines across multiple documents. If anyone has pre-made PowerShell scripts or knows of user-friendly software that can help with this kind of mass comparison, I'd really appreciate it!

6 Answers

Answered By ExcelGenius99 On

Scripting sounds like the best route here, but if you're looking for alternatives, you might also consider Excel. You could merge all your data into one sheet (with a column noting which file each row came from), then use conditional formatting or COUNTIF to highlight values that appear under every source. It might get a bit messy with large logs, though!

Answered By PowerShellPro On

If this isn't a one-time deal, definitely consider scripting it. For a beginner, try anonymizing some sample data and use an LLM to help you generate PowerShell or Python code to find what you need. PowerShell has built-in commands for handling CSVs and text files, and it's not too tough once you grasp the compare logic.
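To give a feel for how small the script can be, here's a minimal Python sketch that finds lines common to every file. The filenames in the usage comment are placeholders, not from the thread:

```python
from functools import reduce
from pathlib import Path

def common_lines(paths):
    """Return the set of lines that appear in every file in `paths`."""
    # One set of unique lines per file, then intersect them all.
    sets = [set(Path(p).read_text().splitlines()) for p in paths]
    return reduce(set.intersection, sets) if sets else set()

# Example usage (hypothetical filenames):
# shared = common_lines(["pc1_log.txt", "pc2_log.txt", "pc3_log.txt"])
```

Set intersection handles any number of files and ignores how often a line repeats within one file, which is usually what you want for this kind of comparison.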

Answered By TechWizard88 On

If you're on Unix/Linux/macOS, you can use grep: `grep -Fxf file1.txt file2.txt` prints the lines of file2.txt that also appear in file1.txt (`-F` fixed strings, `-x` whole-line match, `-f` read patterns from a file). For more than two files, chain it: `grep -Fxf file1.txt file2.txt | grep -Fxf file3.txt`. On Windows, the equivalent is `findstr /i /x /g:file1.txt file2.txt`. These commands can help you quickly identify matches across your files.

Answered By Codeslinger On

I'd probably lean towards tools like Notepad++ combined with Beyond Compare for this task. Those could help you visually spot common lines between your files.

Answered By LogAnalyzer101 On

For ongoing analysis, tools like VSCode's search within a folder could prove useful. Longer term, platforms like Splunk or ELK could work well for log aggregation and indexed search, although they do require a significant investment upfront.

Answered By PseudocodeGuru On

You can read each CSV file and store the lines of interest in a hash table, mapping each string to a count. Before you add a new string, check if it already exists; if so, increment its count. As long as you only count each string once per file, any string whose final count equals the number of files is common to all of them. A tool like GitHub Copilot could help you write this too; just make sure to be specific with your requests.
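The counting approach above can be sketched in a few lines of Python (filenames here are illustrative):

```python
from collections import Counter

def lines_in_all(paths):
    """Return lines whose count equals the number of files searched."""
    counts = Counter()
    for p in paths:
        with open(p) as f:
            # Dedupe within each file so a line counts at most once per file.
            counts.update({line.rstrip("\n") for line in f})
    return [line for line, n in counts.items() if n == len(paths)]
```

Deduplicating per file before counting is the key step: without it, a line repeated twice in one file could reach the target count without appearing everywhere.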
