I'm trying to trace the source of some suspicious PDF editors that have been popping up lately. Everyone I ask claims they haven't done anything wrong, which is frustrating. To get to the bottom of this, I plan to gather web request logs from their devices for a thorough comparison of what everyone has in common.
While I know PowerShell has object-comparison features I could use, I feel it might take me too long since I've only written a few scripts for work. Python is an option too, but there's a learning curve involved. I've found plenty of results on finding differences between files, but not much on identifying matching lines across multiple documents. If anyone has pre-made PowerShell scripts or knows of user-friendly software that can handle this kind of mass comparison, I'd really appreciate it!
6 Answers
Scripting sounds like the best route here, but if you're looking for alternatives, you might also consider Excel. You could merge all your data into one big sheet and then hunt for the duplicates (conditional formatting's "Highlight Duplicate Values", or a COUNTIF column) — lines that repeat across every source are your common entries. It might get a bit messy with large logs, though!
If this isn't a one-time deal, definitely consider scripting it. As a beginner, try anonymizing some sample data and using an LLM to help you generate PowerShell or Python code to find what you need. PowerShell has built-in cmdlets for handling CSVs and text files, and it's not too tough once you grasp the comparison logic.
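To give a feel for how small the script can be, here's a minimal Python sketch of the idea: read each log, keep each file's lines as a set, and intersect the sets. The file names and the `device*.log` glob pattern are placeholders — point it at wherever your exported logs actually live.

```python
from pathlib import Path

def common_lines(paths):
    """Return the set of lines that appear in every one of the given files."""
    sets = []
    for p in paths:
        with open(p, encoding="utf-8") as f:
            # strip trailing newlines/whitespace so identical requests compare equal
            sets.append({line.strip() for line in f})
    return set.intersection(*sets) if sets else set()

if __name__ == "__main__":
    files = sorted(Path(".").glob("device*.log"))  # hypothetical naming scheme
    for line in sorted(common_lines(files)):
        print(line)
```

Note this treats whole lines as the unit of comparison; if your logs carry timestamps, you'd want to strip those columns first or the lines will never match.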
If you're on Unix/Linux/macOS, `grep -Fxf file1 file2` prints the lines of file2 that also appear in file1 (`-F` fixed strings, `-x` whole-line matches, `-f` read the patterns from file1); pipe its output into another grep to fold in a third file. The Windows equivalent is `findstr /x /g:file1 file2` (add `/i` if you want case-insensitive matching). These can quickly surface matches across your files.
I'd probably lean towards tools like Notepad++ (its Compare plugin) combined with Beyond Compare for this task. Those could help you visually spot common lines between your files.
For ongoing analysis, tools like VSCode's search within a folder could prove useful. Longer term, platforms like Splunk or ELK could work well for log aggregation and indexed search, although they do require a significant investment upfront.
You can read each CSV file and store the important lines in a hash table. Before adding a string, check whether it already exists; if so, increment its count. In other words, track how many files each string appears in — after processing everything, the strings whose count equals the number of files are your common lines. A tool like GitHub Copilot could help here too; just be specific with your requests.
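The hash-table approach above can be sketched in a few lines of Python using `collections.Counter` (a dict that does the increment-or-insert step for you). Function and file names here are illustrative, not from any particular library:

```python
from collections import Counter

def line_file_counts(paths):
    """Map each distinct line to the number of files it appears in."""
    counts = Counter()
    for p in paths:
        with open(p, encoding="utf-8") as f:
            # build a set first so a line repeated within one file counts once
            counts.update({line.strip() for line in f})
    return counts

def lines_in_all(paths):
    """Lines whose per-file count equals the total number of files."""
    counts = line_file_counts(paths)
    return {line for line, n in counts.items() if n == len(paths)}
```

Deduplicating within each file before counting is the key detail — otherwise a request repeated many times on one device could look "common" without appearing anywhere else.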