
Document memory requirements for PolicyEngine-US #6429

@MaxGhenis

Summary

PolicyEngine-US lacks documentation about system memory requirements for different use cases. This makes it difficult for users to know what hardware they need, especially when choosing between configurations for household calculations vs full microsimulations.

Context

I ran empirical tests on macOS (Apple M4, 24GB RAM) to measure actual memory usage. The full analysis is available in this notebook: https://gist.github.com/MaxGhenis/253efeb07f4bfa8b50af768accf73c9d

Measured Memory Usage

Based on testing with PolicyEngine-US v1.349.3:

  • Initial import: ~400MB (loading tax-benefit system)
  • Basic simulation creation: ~60MB
  • Household calculations: <100MB additional
  • Multiple simulations: Scales linearly (~60MB per additional instance)
  • CPS microsimulation: 2-4GB (estimated from dataset size, not measured directly)
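
For anyone who wants to reproduce figures like these on their own machine, here is a minimal sketch of one way to take them. It is not the code from the gist: the helper names (`peak_rss_mb`, `measure`, `measure_policyengine`) and the single-adult situation are assumptions made for illustration. It uses the Unix-only `resource` module from the standard library and reports growth in peak resident memory, which never decreases, so it captures step-by-step increases but not memory that is later freed.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure(label, fn):
    """Run fn() and report how much the peak RSS grew while it ran."""
    before = peak_rss_mb()
    result = fn()
    grew = peak_rss_mb() - before
    print(f"{label}: +{grew:.0f} MB (peak RSS now {peak_rss_mb():.0f} MB)")
    return result, grew


def measure_policyengine(year="2024"):
    """Measure import, simulation creation, and one household calculation.

    Requires policyengine-us. The situation is a hypothetical single adult,
    not the household used in the original analysis.
    """
    def do_import():
        import policyengine_us
        return policyengine_us

    measure("initial import", do_import)
    from policyengine_us import Simulation

    situation = {
        "people": {"you": {"age": {year: 30}, "employment_income": {year: 50_000}}},
        "families": {"family": {"members": ["you"]}},
        "marital_units": {"marital_unit": {"members": ["you"]}},
        "tax_units": {"tax_unit": {"members": ["you"]}},
        "spm_units": {"spm_unit": {"members": ["you"]}},
        "households": {"household": {"members": ["you"], "state_name": {year: "CA"}}},
    }
    sim, _ = measure("simulation creation", lambda: Simulation(situation=situation))
    measure("household calculation", lambda: sim.calculate("household_net_income", year))
```

Calling `measure_policyengine()` in a fresh Python process should print one line per step. Results will vary with platform, Python version, and package version, so treat the numbers above as a reference point rather than a guarantee.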

Proposed Documentation

I suggest adding a "System Requirements" section to the README with:

Minimum System Requirements

For Household Calculations Only

  • RAM: 8GB minimum, 16GB recommended
  • Memory usage: ~500MB for basic operations
  • Use cases: Individual household tax calculations, policy impact on specific families

For Microsimulation (CPS/ACS datasets)

  • RAM: 16GB minimum, 24GB recommended, 32GB optimal
  • Memory usage: 2-4GB depending on calculations
  • Use cases: Population-wide analysis, distributional impacts, revenue estimation

For Development and Heavy Usage

  • RAM: 32GB or more
  • Use cases: Running multiple simulations, comparing reforms, development with IDEs

Additional Considerations

When running alongside other applications, add:

  • IDE/Editor: 2-4GB
  • Browser with docs: 2-3GB
  • Jupyter notebooks: 500MB-1GB
  • Each additional PolicyEngine instance: ~60MB
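
To make the arithmetic concrete, the figures above can be combined into a rough budget. This is only a sketch: the function names are made up for illustration, 1GB is treated as 1000MB, and it conservatively assumes the "<100MB" household-calculation overhead applies to every simulation, which was not measured.

```python
# All figures in MB, taken from the measurements and estimates in this issue.
IMPORT_MB = 400            # one-time cost of loading the tax-benefit system
PER_SIMULATION_MB = 60     # each simulation instance
HOUSEHOLD_CALC_MB = 100    # upper bound; assumed here to apply per simulation

# (low, high) ranges for other applications running alongside.
OTHER_APPS_MB = {
    "ide": (2000, 4000),
    "browser": (2000, 3000),
    "jupyter": (500, 1000),
}


def policyengine_mb(n_simulations: int) -> int:
    """Upper-bound memory for n household simulations in one process."""
    return IMPORT_MB + n_simulations * (PER_SIMULATION_MB + HOUSEHOLD_CALC_MB)


def total_range_mb(n_simulations: int, apps=("ide", "browser", "jupyter")):
    """(low, high) total once the listed other applications are added."""
    base = policyengine_mb(n_simulations)
    low = base + sum(OTHER_APPS_MB[app][0] for app in apps)
    high = base + sum(OTHER_APPS_MB[app][1] for app in apps)
    return low, high


print(policyengine_mb(1))   # 560
print(total_range_mb(1))    # (5060, 8560)
```

Under these assumptions, a single household simulation next to an IDE, a browser, and Jupyter lands at roughly 5-8.5GB, before the operating system's own usage is counted.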

Platform Notes

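  • Apple Silicon (M1/M2/M3/M4): Ran efficiently in testing on an M4, where unified memory is shared between the CPU and GPU; other platforms were not tested
  • Memory scales linearly with multiple simultaneous simulations
  • Python's garbage collector frees a simulation's memory once nothing references it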
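
The garbage-collection point can be checked with a small sketch like the one below. It uses the standard library's `tracemalloc`, which tracks Python-level allocations rather than resident memory, so the operating system may still hold on to freed pages for a while. The helper name `held_and_released_mb` is made up for illustration, and the `bytearray` is only a stand-in for a simulation object.

```python
import gc
import tracemalloc


def held_and_released_mb(make_object):
    """Return (MB held while the object is alive, MB still held after release)."""
    tracemalloc.start()
    baseline, _ = tracemalloc.get_traced_memory()

    obj = make_object()
    current, _ = tracemalloc.get_traced_memory()
    held = (current - baseline) / 1e6

    # Drop the only reference, then force a collection for any reference cycles.
    del obj
    gc.collect()
    current, _ = tracemalloc.get_traced_memory()
    released = (current - baseline) / 1e6

    tracemalloc.stop()
    return held, released


# Stand-in for a simulation: a 50 MB buffer.
held, released = held_and_released_mb(lambda: bytearray(50_000_000))
print(f"held: {held:.0f} MB, after release: {released:.0f} MB")
```

With policyengine-us installed, passing a callable that builds a `Simulation` instead of the `bytearray` would give a comparable check for real simulation objects.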

Implementation

Happy to submit a PR adding this documentation to the README or a separate REQUIREMENTS.md file. Let me know your preference!

Testing Environment

  • macOS 15.2
  • Apple M4 processor
  • 24GB RAM
  • Python 3.10
  • PolicyEngine-US v1.349.3
  • numpy 1.26.4 (note: compatibility issues with numpy 2.x)
