{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Quadrat-based statistics in `pointpats`\n",
    "\n",
    "This notebook demonstrates quadrat-based methods for analyzing planar point\n",
    "patterns using the current `pointpats.quadrat_statistics` implementation.\n",
    "\n",
    "The workflow is:\n",
    "\n",
    "1. Represent the point pattern as a NumPy array of coordinates with shape `(n, 2)`.\n",
    "2. Construct a rectangular or hexagonal quadrat tessellation over the **minimum\n",
    "   bounding box (MBB)** of the points.\n",
    "3. Compute a Pearson chi-square statistic from per-quadrat counts.\n",
    "4. Optionally, obtain a **Monte Carlo p-value** by simulating CSR realizations\n",
    "   within the same study window, with reproducibility controlled by `rng`.\n",
    "\n",
    "Key implementation facts (relevant for interpretation):\n",
    "\n",
    "- The default chi-square statistic uses **equal expected counts** across included cells.\n",
    "- A `window` may be supplied; if omitted, the MBB rectangle is used.\n",
    "- Monte Carlo inference makes the reference distribution window-aware, but it does not\n",
    "  change how expected counts are defined.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Quadrat statistics and CSR\n",
    "\n",
    "Quadrat analysis partitions space into a set of non-overlapping cells and compares\n",
    "the observed counts per cell to what is expected under **complete spatial randomness (CSR)**.\n",
    "\n",
    "Let $O_i$ be the count in cell \\(i\\), and let \\(k\\) be the number of cells\n",
    "included in the analysis. The implemented test statistic is the Pearson chi-square\n",
    "statistic with **equal expected counts** across included cells:\n",
    "\n",
    "$$\n",
    "X^2 = \\sum_{i=1}^k \\frac{(O_i - \\bar{O})^2}{\\bar{O}},\n",
    "\\qquad\n",
    "\\bar{O} = \\frac{1}{k} \\sum_{i=1}^k O_i.\n",
    "$$\n",
    "\n",
    "Analytical inference compares $X^2$  to a chi-square distribution with $(k-1)$\n",
    "degrees of freedom. Simulation-based inference (Monte Carlo) instead compares the\n",
    "observed $X^2$ to the sampling distribution obtained from CSR realizations\n",
    "generated in the same window.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Example: the juvenile delinquency point pattern\n",
    "\n",
    "We will illustrate the quadrat workflow using the `juvenile.shp` example dataset.\n",
    "The dataset is read as a GeoDataFrame and then converted to a NumPy coordinate array\n",
    "`xy` with shape `(n, 2)`.\n",
    "\n",
    "These coordinates are passed directly to `QStatistic`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import libpysal as ps\n",
    "import numpy as np\n",
    "import geopandas as gpd\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import `pointpats.quadrat_statistics` and compute quadrat statistics.\n",
    "\n",
    "The main entry point is:\n",
    "\n",
    "```python\n",
    "QStatistic(points, shape=\"rectangle\"|\"hexagon\", realizations=0, window=None,\n",
    "           drop_nonintersecting=False, rng=None)\n",
    "```\n",
    "\n",
    "- `points` is always a NumPy array of shape `(n, 2)`.\n",
    "- `shape` selects the tessellation type.\n",
    "- `realizations > 0` enables Monte Carlo inference.\n",
    "- `rng` controls reproducibility of simulations.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pointpats.quadrat_statistics import QStatistic"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Open the point shapefile and extract coordinates.\n",
    "\n",
    "Here we use `GeoDataFrame.get_coordinates()` to obtain an `(n, 2)` NumPy array,\n",
    "which is the required input type for `QStatistic`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "juv = gpd.read_file((ps.examples.get_path(\"juvenile.shp\")))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "juv.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xy = juv.get_coordinates()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resRect = QStatistic(xy)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resRect.chi2_pvalue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resRect.plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex = QStatistic(xy, shape='hexagon')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex.plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex.chi2_pvalue"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Simulation-based inference (Monte Carlo)\n",
    "\n",
    "When `realizations > 0`, CSR realizations are generated **within the same window**\n",
    "and the chi-square statistic is recomputed for each realization. The Monte Carlo\n",
    "p-value uses the standard +1 correction:\n",
    "\n",
    "$$\n",
    "p = \\frac{\\#\\{X^2_{sim} \\ge X^2_{obs}\\} + 1}{R + 1},\n",
    "$$\n",
    "\n",
    "where $R$ is the number of realizations.\n",
    "\n",
    "### Reproducibility\n",
    "\n",
    "All simulation-based results can be made reproducible by supplying `rng`:\n",
    "\n",
    "- `None` (default): create a new `numpy.random.default_rng()`\n",
    "- integer seed\n",
    "- `numpy.random.Generator`\n",
    "- legacy `numpy.random.RandomState` (wrapped)\n",
    "\n",
    "The same generator is passed through the CSR simulation code, ensuring repeatable\n",
    "Monte Carlo p-values.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy.random import default_rng"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rng = default_rng(42)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex_sim = QStatistic(xy, shape='hexagon', realizations=99, rng=rng)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex_sim.chi2_r_pvalue\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex_sim.chi2_realizations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "resHex_sim.chi2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Irregular windows\n",
    "\n",
    "Quadrat grids are constructed over the MBB of the points. If the study window is\n",
    "irregular (e.g., a polygon), some grid cells may lie entirely outside the window.\n",
    "\n",
    "- If such outside cells are included, they contribute guaranteed zeros and can\n",
    "  distort the test.\n",
    "\n",
    "\n",
    "### How Monte Carlo interacts with this\n",
    "\n",
    "Comparing the observed chi-square statistic to simulated chi-square values is a\n",
    "valid Monte Carlo test **for the statistic being used**, because the same grid,\n",
    "window, and inclusion rule are applied to both observed and simulated patterns.\n",
    "\n",
    "Monte Carlo inference makes the *reference distribution* window-aware, but it does\n",
    "not convert the test into an area-proportional quadrat test when the window is irregular.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from shapely.geometry import Polygon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "window = Polygon([ [0, 0], [1, 0], [0, 1] ])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pointpats.random import poisson"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pp = poisson(window, 100/1, rng=42)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qs = QStatistic(pp, nx=5, ny=5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qs.plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qs.chi2_pvalue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qsw = QStatistic(pp, nx=5, ny=5, window=window, realizations=99, rng=42)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qsw.chi2_r_pvalue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qsw.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Recap\n",
    "\n",
    "In this notebook, we:\n",
    "\n",
    "- represented a point pattern as an `(n, 2)` NumPy coordinate array,\n",
    "- computed quadrat counts using rectangular and hexagonal tessellations over the MBB,\n",
    "- used a Pearson chi-square statistic with equal expected counts across included cells,\n",
    "- obtained either an analytical p-value or a Monte Carlo p-value from CSR realizations,\n",
    "- clarified the issues associated with irregular windows, and\n",
    "- ensured reproducibility of simulations via an explicit `rng`."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.2"
  },
  "widgets": {
   "state": {},
   "version": "1.1.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
