CivilView Scraper
The CivilView Scraper is a Python-based automated data ingestion pipeline that scrapes sheriff sale listings from salesweb.civilview.com across 50 counties/parishes in 12+ states. It is the primary source of off-market foreclosure property data feeding into Dora's Deal Operating System, ultimately populating dispotree property records via the property-intel-db.
Overview
- Source: https://salesweb.civilview.com/Sales/SaleDetails?PropertyId={id} (public CivilView portal)
- Script location: ~/.hermes/scripts/civilview_scraper.py (Python)
- SQLite mirror: ~/.hermes/civilview.db (intermediate cache)
- PostgreSQL output: property_intel database at localhost:5434
- Total properties: 2,237 (as of latest scrape)
- Coverage: 50 counties and parishes across OH, NJ, LA, PA, FL, OR, IA, CO, IL, AZ, TX, WA, and MN, plus ID, DE, AR, SC, and KS
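As a small illustration of the source URL pattern listed above, the following sketch builds a sale-details URL from a PropertyId and recovers the ID again. The helper names `sale_details_url` and `property_id_from_url` are hypothetical and are not taken from civilview_scraper.py.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Base of the public sale-details page, taken from the Source entry above.
BASE_URL = "https://salesweb.civilview.com/Sales/SaleDetails"


def sale_details_url(property_id: int) -> str:
    """Return the sale-details URL for one CivilView PropertyId."""
    return f"{BASE_URL}?{urlencode({'PropertyId': property_id})}"


def property_id_from_url(url: str) -> int:
    """Recover the PropertyId from a sale-details URL (inverse of the above)."""
    query = parse_qs(urlparse(url).query)
    return int(query["PropertyId"][0])
```

Round-tripping the ID this way is one possible way to populate both the url and source_property_id columns from a single value.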
Scraped Counties
| County | State | Scraper ID |
|--------|-------|-----------|
| Allen County | OH | 34 |
| Ascension Parish | LA | 55 |
| Atlantic County | NJ | 25 |
| Bergen County | NJ | 7 |
| Burlington County | NJ | 3 |
| Camden County | NJ | 1 |
| Canyon County | ID | 68 |
| Cape May County | NJ | 52 |
| Champaign County | IL | 42 |
| Cumberland County | NJ | 6 |
| Deschutes County | OR | 30 |
| Essex County | NJ | 2 |
| Gloucester County | NJ | 19 |
| Guadalupe County | TX | 86 |
| Hudson County | NJ | 10 |
| Hunterdon County | NJ | 32 |
| Josephine County | OR | 54 |
| Kent County | DE | 4 |
| Lake County | IL | 58 |
| Larimer County | CO | 48 |
| Lehigh County | PA | 51 |
| Lorain County | OH | 18 |
| Maricopa County | AZ | 47 |
| McLennan County | TX | 76 |
| Medina County | OH | 81 |
| Middlesex County | NJ | 73 |
| Monmouth County | NJ | 8 |
| Montgomery County | PA | 23 |
| Morris County | NJ | 9 |
| New Castle County | DE | 24 |
| Ocean County | NJ | 85 |
| Orleans Parish | LA | 28 |
| Palm Beach County | FL | 49 |
| Passaic County | NJ | 17 |
| Philadelphia County | PA | 60 |
| Pottawattamie County | IA | 11 |
| Pulaski County | AR | 74 |
| Richland County | SC | 61 |
| Rockwall County | TX | 69 |
| Salem County | NJ | 20 |
| Santa Rosa County | FL | 75 |
| Scott County | IA | 37 |
| Shawnee County | KS | 56 |
| Snohomish County | WA | 26 |
| Stearns County | MN | 62 |
| Story County | IA | 16 |
| Sussex County | DE | 12 |
| Union County | NJ | 15 |
Data Fields
Top-level columns (SQLite/PostgreSQL schema — ~45 columns)
The scraper normalizes CivilView data into a structured properties table:
| Field | Type | Description |
|-------|------|-------------|
| id | INTEGER | Primary key |
| source_id | INTEGER | Data source identifier |
| source_property_id | TEXT | CivilView PropertyId |
| source_county_id | TEXT | County scraper ID |
| county | TEXT | County name |
| state | TEXT | State abbreviation |
| sheriff_number | TEXT | Sheriff sale number |
| court_case_number | TEXT | Court docket number |
| parcel_number | TEXT | APN / parcel ID |
| plaintiff | TEXT | Plaintiff / lender name |
| defendant | TEXT | Defendant / borrower name(s) |
| address | TEXT | Property address |
| city | TEXT | City |
| zip_code | TEXT | ZIP code |
| property_type | TEXT | Property type (R, C, I, etc.) |
| upset_amount | REAL | Upset price / minimum bid |
| estimated_value | REAL | Appraised/estimated value |
| land_value | REAL | Land value |
| improvement_value | REAL | Improvement value |
| attorney | TEXT | Plaintiff attorney |
| attorney_phone | TEXT | Attorney phone |
| sales_date | TEXT | Sheriff sale date |
| last_sale_date | TEXT | Date of last sale |
| last_sale_price | REAL | Last sale price |
| days_on_market | INTEGER | Days since listing |
| current_status | TEXT | Status (Active, Scheduled, Sold, Postponed, etc.) |
| zoning | TEXT | Zoning code |
| zoning_description | TEXT | Zoning description |
| lot_size_sqft | INTEGER | Lot size in sq ft |
| buildable_area_sqft | INTEGER | Buildable area |
| max_far | REAL | Max floor-area ratio |
| max_units | INTEGER | Max dwelling units |
| flood_zone | INTEGER | Flood zone flag |
| wetlands_percent | REAL | Wetlands percentage |
| historic_district | INTEGER | Historic district flag |
| owner_name | TEXT | Owner name |
| owner_age | INTEGER | Owner age |
| years_owned | INTEGER | Years owned |
| owner_occupies | INTEGER | Owner-occupied flag |
| absentee_owner | INTEGER | Absentee owner flag |
| raw_data | TEXT (JSON) | Full CivilView raw data (JSON dict — see below) |
| url | TEXT | CivilView sale details URL |
| last_scraped | TEXT | ISO timestamp of last scrape |
| created_at | TEXT | Record creation timestamp |
| updated_at | TEXT | Record update timestamp |
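A minimal sketch of how a subset of these columns could be created and upserted in the SQLite cache. The column names and types come from the table above; the unique constraint on (source_county_id, source_property_id), the choice of which columns to refresh, and the helper name `upsert_property` are assumptions made for illustration, not the scraper's actual schema or code.

```python
import sqlite3

# Subset of the documented columns. The UNIQUE constraint is an assumption
# that lets repeated scrapes update a listing instead of duplicating it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS properties (
    id                 INTEGER PRIMARY KEY,
    source_property_id TEXT NOT NULL,
    source_county_id   TEXT NOT NULL,
    county             TEXT,
    state              TEXT,
    sheriff_number     TEXT,
    upset_amount       REAL,
    sales_date         TEXT,
    current_status     TEXT,
    raw_data           TEXT,
    last_scraped       TEXT,
    UNIQUE (source_county_id, source_property_id)
);
"""

# Insert a new listing, or refresh the mutable fields of an existing one.
# ON CONFLICT ... DO UPDATE requires SQLite 3.24 or newer.
UPSERT = """
INSERT INTO properties (
    source_property_id, source_county_id, county, state, sheriff_number,
    upset_amount, sales_date, current_status, raw_data, last_scraped
) VALUES (
    :source_property_id, :source_county_id, :county, :state, :sheriff_number,
    :upset_amount, :sales_date, :current_status, :raw_data, :last_scraped
)
ON CONFLICT (source_county_id, source_property_id) DO UPDATE SET
    upset_amount   = excluded.upset_amount,
    sales_date     = excluded.sales_date,
    current_status = excluded.current_status,
    raw_data       = excluded.raw_data,
    last_scraped   = excluded.last_scraped;
"""


def upsert_property(conn: sqlite3.Connection, row: dict) -> None:
    """Write one normalized listing, keyed by county ID and PropertyId."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, row)
```

Keying on the county and property IDs together is a conservative choice in case PropertyId values are not unique across counties; the source does not say whether they are.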
raw_data JSON keys (dynamically scraped from CivilView per county)
The raw_data field stores the complete CivilView page data as a JSON dict. Across all counties, the following unique keys appear (some counties expose more fields than others):
| Key | Description | Example |
|-----|-------------|---------|
| Sheriff # | Sheriff sale number | F-26000141 |
| Court Case # | Court docket number | F-2811-25 |
| Plaintiff | Plaintiff/lender | Fundpality II, LLC |
| Defendant | Defendant/borrower(s) | Kerri Anne Rizzotto; ... |
| Attorney | Plaintiff's attorney | Gary C. Zeitz, L.L.C. |
| Address | Street address | 10 W Marvin Avenue Linwood NJ 08221 |
| Sales Date | Auction date/time | 4/30/2026 |
| Starting Bid | Upset amount | $47,000.00 |
| Approx. Judgment* | Judgment amount | $33,327.08 |
| Appraised Value | Appraised value | $120,000.00 |
| Parcel # | APN / parcel ID | 25-1910-19-006.000 |
| Property Note | Notes (occupancy, condition, etc.) | OCCUPANCY STATUS: OCCUPIED |
| Upset Amount | Upset amount (NJ counties) | $47,000.00 |
| Deed | Deed information | (varies) |
| Deed Address | Deed mailing address | (varies) |
| Priors | Prior lien/encumbrance info | OPEN TAXES QUARTERS... $5,049.06 |
| Attorney Phone | Attorney phone number | (varies) |
| Estimated Value | Estimated market value | $120,000.00 |
| Land Value | Land assessment value | (varies) |
| Improvement Value | Improvement assessment value | (varies) |
| Property Type | Property classification | (varies) |
| Zoning | Zoning code | (varies) |
| Zoning Description | Zoning description | (varies) |
| Lot Size SqFt | Lot size in sq ft | (varies) |
| Status History | Chronological status changes | Cancelled/Settled, Scheduled, ... |
Total: 45 normalized columns, plus the 25 raw_data keys listed above (several of which duplicate normalized columns).
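The step from raw_data keys to normalized columns might look roughly like the sketch below. The key names and example formats ($47,000.00, 4/30/2026) come from the tables above; the function names, the preference order between overlapping keys such as Upset Amount and Starting Bid, and the decision to return None for unparseable values are all assumptions.

```python
import json
from datetime import datetime
from typing import Optional


def parse_money(text: Optional[str]) -> Optional[float]:
    """Convert a string such as '$47,000.00' to 47000.0, or None if unparseable."""
    if not text:
        return None
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def parse_sale_date(text: Optional[str]) -> Optional[str]:
    """Convert 'M/D/YYYY' to an ISO date string.

    Values in any other layout (for example with a time component)
    yield None in this sketch rather than a guess.
    """
    if not text:
        return None
    try:
        return datetime.strptime(text.strip(), "%m/%d/%Y").date().isoformat()
    except ValueError:
        return None


def normalize(raw: dict) -> dict:
    """Map one raw_data dict onto a subset of the normalized columns."""
    # Assumed precedence: county-specific 'Upset Amount' first, then 'Starting Bid'.
    upset = raw.get("Upset Amount") or raw.get("Starting Bid")
    value = raw.get("Appraised Value") or raw.get("Estimated Value")
    return {
        "sheriff_number": raw.get("Sheriff #"),
        "court_case_number": raw.get("Court Case #"),
        "plaintiff": raw.get("Plaintiff"),
        "defendant": raw.get("Defendant"),
        "parcel_number": raw.get("Parcel #"),
        "upset_amount": parse_money(upset),
        "estimated_value": parse_money(value),
        "sales_date": parse_sale_date(raw.get("Sales Date")),
        "raw_data": json.dumps(raw),  # keep the full page data alongside
    }
```

Because counties expose different keys, every lookup uses `dict.get` so that a missing key becomes a NULL column rather than an error.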
Schedule
The scraper runs twice daily on a cron schedule:
- 10:00 AM Eastern Time — morning scrape for fresh daily updates
- 8:00 PM Eastern Time — evening scrape to capture end-of-day changes
Each run iterates through all 50 counties, scraping each county's current sheriff sale listings from CivilView. The pipeline:
1. Runs civilview_scraper.py --county {id} per county
2. Outputs to ~/.hermes/civilview.db (SQLite intermediate)
3. Runs ingest_to_terminal.py to push into the terminal-server's terminal.db
4. Upserts into the property_intel PostgreSQL database at localhost:5434
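The first three steps above can be sketched as a small driver. Only the script names and the --county flag come from this page; the assumption that ingest_to_terminal.py lives in the same scripts directory, the injectable `runner` parameter, and the policy of continuing after a failed county are illustrative choices. The PostgreSQL upsert (step 4) is left out because the page does not say which program performs it.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, List, Sequence

SCRIPTS_DIR = Path.home() / ".hermes" / "scripts"


def scraper_command(county_id: int) -> List[str]:
    """Build the per-county scraper invocation described in step 1."""
    script = SCRIPTS_DIR / "civilview_scraper.py"
    return [sys.executable, str(script), "--county", str(county_id)]


def run_pipeline(
    county_ids: Sequence[int],
    runner: Callable = subprocess.run,
) -> List[int]:
    """Scrape each county, then run the terminal ingest once.

    Returns the county IDs whose scrape exited non-zero, so that one
    failing county does not stop the rest of the run.
    """
    failed = []
    for county_id in county_ids:
        result = runner(scraper_command(county_id), check=False)
        if result.returncode != 0:
            failed.append(county_id)
    # Assumed location: the page names the script but not its directory.
    ingest = SCRIPTS_DIR / "ingest_to_terminal.py"
    runner([sys.executable, str(ingest)], check=False)
    return failed
```

Passing the runner as a parameter keeps the sketch testable without launching real processes; a cron entry for each of the two daily runs would then call `run_pipeline` with the scraper IDs from the county table.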
Integration with DispoTree
The CivilView data flows into dispotree via the property-intel-db enrichment pipeline. Specifically:
- contractOcrData (JSONB): The Property model in DispoTree has a contractOcrData JSONB field that stores the CivilView raw data (sheriff #, plaintiff, defendant, attorney, court case, status, property notes, etc.)
- auctionDate (DATE): Maps from CivilView sales_date and Sales Date
- parcelNumber (STRING): Maps from CivilView Parcel #
- redemptionPeriodDays (INTEGER): Derived from CivilView data where available
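The field mapping above could be expressed roughly as follows. The DispoTree field names and their CivilView sources come from this page; the function names, the date formats tried, and the fallback order between the normalized column and the raw key are assumptions. redemptionPeriodDays is left as None because the derivation is not described here.

```python
import json
from datetime import datetime
from typing import Any, Dict, Optional


def _parse_date(text: Optional[str]) -> Optional[str]:
    """Return an ISO date string for either 'YYYY-MM-DD' or 'M/D/YYYY' input."""
    if not text:
        return None
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def to_dispotree_fields(row: Dict[str, Any]) -> Dict[str, Any]:
    """Map one property_intel row onto the DispoTree Property fields."""
    raw = json.loads(row["raw_data"]) if row.get("raw_data") else {}
    # Assumed order: prefer the normalized sales_date column, then fall
    # back to the raw 'Sales Date' key.
    auction_date = _parse_date(row.get("sales_date")) or _parse_date(raw.get("Sales Date"))
    return {
        "contractOcrData": raw,  # stored as-is in this sketch
        "auctionDate": auction_date,
        "parcelNumber": raw.get("Parcel #") or row.get("parcel_number"),
        "redemptionPeriodDays": None,  # derivation not specified on this page
    }
```

The fields referenced elsewhere on this page, such as contractOcrData.isForeclosure, are presumably added by a later enrichment step; this sketch does not attempt to compute them.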
When Dora processes a deal in DispoTree, the contractOcrData field is used for:
- Deal scoring: contractOcrData.isForeclosure and contractOcrData.existingMortgageBalance feed into compositeScore and sellerMotivationScore
- Hubzu publishing: contractOcrData.parcelNumber maps to the Hubzu tax ID field
- Contract review: contractOcrData.isSigned and contractOcrData.isDaisyChain drive compliance checks
- Seller motivation analysis: Property notes and occupancy status from CivilView inform motivation scoring
Data Flow
```
salesweb.civilview.com
        │
        ▼
civilview_scraper.py (Python, ~/.hermes/scripts/)
        │
        ├──► ~/.hermes/civilview.db (SQLite cache)
        │
        ├──► terminal-server/terminal.db (Sheriff Terminal UI)
        │
        └──► PostgreSQL: localhost:5434 / property_intel (2,237 properties)
                  │
                  ▼
             Dora Scoring Engine (13-factor scoring)
                  │
                  ▼
             DispoTree property.contractOcrData (JSONB)
                  │
                  ▼
             Deal creation, compliance checks, Hubzu publishing
```
Related Pages
- property-intel-db: The PostgreSQL database that stores scraped CivilView data
- dispotree: Consumes CivilView data via the contractOcrData JSONB field