I'm working on a simulation involving 100,000 entities over several time periods. Each entity has specific properties like revenue, expenditure, size, location, industry, and current cash levels. In each time period, the entities generate revenue and spend money. After each period, I want to check the current cash levels and see if any entity has gone into the red or is spending more than its income. I'm considering using matrix equations to handle all these entities simultaneously by storing parameters in matrices, but I'm worried about the performance difference compared to creating separate objects for each entity. Is there a significant performance gap between using matrices and entity objects in this type of simulation?
1 Answer
I'd recommend starting with an object-oriented approach. While you’re testing, a smaller number of entities (like 20) will still give you meaningful insights. Once the simulation works as expected, you can refactor to use numpy arrays for efficiency. Vectorized array operations will generally be much faster than looping over 100,000 Python objects, but it’s important not to over-optimize at the beginning; focus on getting it right first and make sure you have good unit tests in place.
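
As a rough sketch of what that refactor could look like, here is one period of the simulation written both ways. The names `Entity`, `step_objects`, and `step_arrays` and the sample numbers are made up for illustration, and the update rule (cash plus revenue minus expenditure) is my reading of the question rather than your actual model.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Entity:
    """Hypothetical per-entity object holding the fields from the question."""
    cash: float
    revenue: float
    expenditure: float


def step_objects(entities):
    """Object version: advance each entity one period in a Python loop."""
    flagged = []
    for i, e in enumerate(entities):
        e.cash += e.revenue - e.expenditure
        # Flag entities that are in the red or spending more than they earn.
        if e.cash < 0 or e.expenditure > e.revenue:
            flagged.append(i)
    return flagged


def step_arrays(cash, revenue, expenditure):
    """Array version: the same update applied to all entities at once."""
    new_cash = cash + revenue - expenditure
    in_the_red = new_cash < 0
    overspending = expenditure > revenue
    return new_cash, in_the_red, overspending


# Three illustrative entities; the values are arbitrary.
entities = [Entity(100.0, 20.0, 10.0), Entity(10.0, 5.0, 25.0), Entity(50.0, 30.0, 30.0)]
flagged_objects = step_objects(entities)

cash = np.array([100.0, 10.0, 50.0])
revenue = np.array([20.0, 5.0, 30.0])
expenditure = np.array([10.0, 25.0, 30.0])
cash, in_the_red, overspending = step_arrays(cash, revenue, expenditure)

# Indices of entities flagged by either check in the array version.
flagged_arrays = np.flatnonzero(in_the_red | overspending).tolist()
```

Keeping the object version around is handy for unit tests: on a small set of entities, both versions should flag the same indices, which gives you a check that the refactor did not change the behaviour.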

Thanks for the tip! So, when you mention switching to numpy arrays later, should I completely drop the object-oriented method, or can I integrate numpy somehow with it?