@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sub1: <https://w3id.org/sciencelive/np/RACSKwvXe5B441qTr7AuTzmT74g03VRuERc-31egcScZI/> .
@prefix sciencelive: <https://w3id.org/sciencelive/o/terms/> .

<https://w3id.org/sciencelive/np/RAnYD9w4jylurPK2GH4-YmKtiqyNOy8is8itzxuTgd3Qw/phillips-table2-maxent-reproduction>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Reproduction of Phillips et al. 2009 Table 2 — MaxEnt random vs target-group background AUC";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q122175981>,
    <http://www.wikidata.org/entity/Q2725298>, <http://www.wikidata.org/entity/Q327120>;
  sciencelive:hasDeviationDescription """(1) MaxEnt engine: the open-source elapid/maxnet implementation is used rather than Phillips' original Java Maxent, so exact AUC decimals are expected to differ even where direction and magnitude agree.
(2) Only MaxEnt is run; Phillips' broader Table 2 also covered BRT, MARS, GAM and other methods, which are not reproduced here.
(3) One species fewer is modelled than Phillips' 226 (225 here) because a species whose presence-absence evaluation column contains only one class (all presences or all absences) gives an undefined AUC and is dropped.
(4) A minimum-presence threshold of 5 occurrences is applied before a species is fit.""";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q7150>;
  sciencelive:hasMethodologyDescription "The Elith et al. 2006 NCEAS presence-only / presence-absence benchmark — the same data Phillips used — is obtained from the rspatial/disdat R data package (data paper doi 10.17161/bi.v15i2.13384) by downloading its .rds tables and reading them in Python with pyreadr. For each species across the six regions (AWT, CAN, NSW, NZ, SA, SWI), a MaxEnt model (elapid engine, linear + quadratic + hinge features) is fit twice: once against the region's random background sites supplied by disdat, and once against a target-group background formed from the pooled presence localities of all species in the same biological target group. Both models predict at the independent presence-absence evaluation sites and AUC is computed with scikit-learn. Per-species AUC for the two background types is aggregated to region, group and overall means, and the paired difference is tested with a Wilcoxon signed-rank test. 225 species across 6 regions are modelled.";
  sciencelive:hasScopeDescription "This study reproduces the Maxent row of Phillips et al. 2009 Table 2: the comparison of mean predictive AUC for presence-only species distribution models trained with random background versus target-group background. In scope: the direction and magnitude of the AUC gain from target-group background, aggregated across species, and the stronger gain in regions with greater sampling bias. Out of scope: the other modelling methods Phillips also tested (BRT, MARS, GAM and others) and the absolute predicted-distribution maps — only the MaxEnt AUC comparison on this benchmark is tested here.";
  sciencelive:targetsClaim <https://w3id.org/sciencelive/np/RAHF_1MUfAVbXhXvj_Wtq8GsP8ZWjc9LerDhdLqhv_SzE/target-group-background-improves-auc> .

<https://w3id.org/sciencelive/np/RAKYb38hHcOMN5D-X1o_1RKSmDUojUUTxcT4pjL_Pe24s/loveland-2024-rom-tradeoff-study>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Stat-level reproduction of the Loveland et al. (2024) coupled ADCIRC+SWAN reduced-order source-term trade-off";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q108300283>,
    <http://www.wikidata.org/entity/Q121742>, <http://www.wikidata.org/entity/Q2586158>,
    <http://www.wikidata.org/entity/Q392603>, <http://www.wikidata.org/entity/Q55602927>;
  sciencelive:hasDeviationDescription """1. No ADCIRC+SWAN model re-runs. Loveland's runs were on 1064 Intel Xeon Platinum 8280 cores (19 nodes of TACC Frontera, \"Cascade Lake\"), with unstructured meshes of 6,675,517 elements / 3,352,598 nodes for Ike and 3,102,441 elements / 1,593,485 nodes for Ida. This compute scale is out of scope for a laptop / GitHub Actions / Docker reproduction. The replication therefore transcribes Loveland's published model-side outputs rather than regenerating them. The trade-off ratios reported in the Outcome's Evidence field are derived from Loveland's Table 4; the WSE-RMSE values are from Tables 5-6; the wave-statistics RMSE values are from Tables 5-7 and the §5.2 prose.

2. DesignSafe deposit not retrievable. Loveland deposited their model inputs (meshes, fort.26 source-term configs, OWI Ike winds, HURDAT2-derived Ida GAHM winds, NOAA gauge / buoy time series, model output files) at DOI 10.17603/DS2-7HBT-EF65 (DesignSafe-CI project PRJ-4678). Both /api/projects/v2/PRJ-4678/ and /api/datafiles/listing/public/designsafe.storage.published/PRJ-4678/ return HTTP 401 to anonymous requests, so the deposit cannot be re-fetched by an unauthenticated reproducer. The fort.26 files for the Ida run are reprinted in the paper's Appendix on pp. 12-13.

3. NDBC wave-buoy 42007 absent from the 2021 historical archive. Loveland's Table 3 lists 10 buoys; only 9 are retrievable for Ida from NDBC's URL pattern. The missing buoy is recorded in data/raw/sources.json.

4. Per-storm wall-clock-cost framing granularity. Loveland's §5.1 prose (\"around 1.5 times longer\") and §6 prose (\"about a 40 percent increase\") cover Hurricane Ike (Gen3/Gen1 = 1.44×, Gen3/Gen2 = 1.52×) but understate Hurricane Ida (Gen3/Gen1 = 1.70×, Gen3/Gen2 = 1.76×) by approximately 20 percentage points relative to Loveland's own Table 4. This is not a deviation from method — the replication uses the same per-storm cells from Table 4 — but is surfaced by the Outcome because Loveland's prose summary lacks this per-storm distinction. See Outcome Limitations item 5.""";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q1337681>;
  sciencelive:hasMethodologyDescription """This study is a stat-level (table-and-figure-level) reproduction, deliberately distinguishing what was independently re-derived from what was transcribed from the source paper. The downstream reader should not infer model-level verification.

INDEPENDENTLY RE-DERIVED (notebooks/01_data_download.py + 02_data_clean.py):
- Observational baseline. NOAA CO-OPS water-level gauge time series for the 14 Ike stations (Table 1 of the paper) over 5-14 September 2008 and the 13 Ida stations (Table 2) over 26 August - 4 September 2021, downloaded fresh from the public CO-OPS API. NDBC wave-buoy data (significant wave height, peak period, mean wave direction) for 10 buoys (Table 3), 9 of which were retrievable; buoy 42007 is absent from NDBC's 2021 historical archive at the expected URL pattern. Both sources were cleaned into tidy per-storm xarray Datasets, with each variable resampled separately onto a common time grid (a naive nearest-reindex collapses NDBC's 10-min wind interleave into NaN-filled wave records; per-variable resampling avoids this).
- Storm-peak consistency check. Per-gauge peak water levels were extracted (Galveston Pier 21 = 3.20 m on 13 September 2008; Grand Isle = 1.65 m on 29 August 2021) and verified against NHC historical reports as a sanity check on the observational baseline Loveland modelled against.

TRANSCRIBED FROM THE SOURCE PAPER (notebooks/03_analysis.py, data/published_baselines/):
- All model-side outputs. Run times (Table 4) for each (storm, config) cell, WSE-RMSE per gauge (Tables 5-6 and the prose summaries in §5.3 of the paper), Hs / Tp / mean-wave-direction RMSE per buoy (Tables 5-7 transcribed from §5.2). These values come from Loveland's Frontera-scale ADCIRC+SWAN runs and were not re-computed.
- Source-term configuration files. The fort.26 files printed in the paper's Appendix (pp. 12-13) were inspected for documentation but not used to drive any new SWAN runs.
- Spatial figures. The paper's Fig. 9 (spatial Hs differences near hurricane tracks) and Fig. 12-13 (spatial WSE differences) were inspected for qualitative context but not regenerated.

What the comparison therefore tests: the internal consistency of Loveland's published model-vs-observation Δ values with the publicly retrievable observational baseline, and the per-storm trade-off ratios visible in Loveland's own Table 4. Not the reproducibility of the model runs themselves.

The headline statistic (Gen3 / (Gen1 or Gen2) wall-clock ratio per storm, and maximum WSE-RMSE Δ across source-term configurations per storm) is consolidated in results/headline_comparison.csv and visualised in figures/main_result.png. Orchestrated via Snakemake (pipeline rules in Snakefile); reproducible environment via pixi (pixi.toml + pixi.lock); container build via Dockerfile to ghcr.io/annefou/coastal-rom-replication.""";
  sciencelive:hasScopeDescription """In scope: the conditional trade-off claim from Loveland's §6 Conclusions as carried verbatim by the Quote — that reduced-order SWAN source terms (Gen1 first-generation, Gen2 second-generation) save computation relative to the third-generation ST6 Gen3 package without compromising water-surface-elevation (WSE) accuracy at NOAA gauges when WSE is of primary interest, with Loveland's own quantified caveat that large source-term sensitivities are observed in significant-wave-height fields near hurricane tracks. The two storm scenarios (Hurricane Ike 2008, Hurricane Ida 2021) and the four model configurations (No SWAN, Gen1, Gen2, Gen3) are tested in full.

Out of scope: (1) operational forecasting where wind-field uncertainty dominates the error budget — a second conditioning Loveland states in §6 (\"if the meteorological forcing is not sufficiently accurate ... the additional computational cost associated with the detailed Gen3 source terms may not improve accuracy of the model\") that the chosen Quote does not carry; the Outcome's Validated label therefore applies to the hindcasting regime only. (2) Model-level reproducibility of the ADCIRC+SWAN runs themselves — see Methodology field for what was and was not independently re-derived. (3) Generalisation beyond the Gulf of Mexico or beyond Loveland's two test storms.""";
  sciencelive:targetsClaim <https://w3id.org/sciencelive/np/RAnDNlZ87EvWIfh3BqqBf3b5BtZYyJ7z8QYZv-bXhbAF0/loveland-2024-rom-tradeoff-claim> .

<https://w3id.org/sciencelive/np/RAGtxXgvYl-b7NOkyS3K34z1yDDkPqzGuiTr54nA2uH7U/decrop-2025-reproduction-study>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Independent reproduction of Decrop 2025 phytoplankton CNN metrics";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q113009580>,
    <http://www.wikidata.org/entity/Q1693>, <http://www.wikidata.org/entity/Q17084460>,
    <http://www.wikidata.org/entity/Q184755>, <http://www.wikidata.org/entity/Q2539>,
    <http://www.wikidata.org/entity/Q7173>, <http://www.wikidata.org/entity/Q98526763>;
  sciencelive:hasDeviationDescription "None of substance. The reproduction uses the same code package (planktonclas), the same released model weights, and the same test.txt partition that the original evaluation reports against. One minor naming note: Table 1 of the paper lists test=33,829 and val=33,718, whereas the released split files have those numbers swapped (test.txt=33,718, val.txt=33,829). This study uses the released test.txt, which is the file the model was actually evaluated on. Hardware differs (CPU instead of GPU); this is not expected to change the metric values.";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q7173>;
  sciencelive:hasMethodologyDescription "The package planktonclas==0.2.0, the same code package authored by Decrop and colleagues and used to train the original model, is installed from PyPI. The pretrained EfficientNetV2-B0 weights and the dataset split files are downloaded from Zenodo 10.5281/zenodo.15269453, and the FlowCam phytoplankton image dataset from Zenodo 10.5281/zenodo.10554845. Inference is run on the 33,718 images listed in the released test.txt using planktonclas.test_utils.predict, which applies 10-crop test-time augmentation. Top-1 accuracy, top-5 accuracy, micro F1, macro F1, and weighted F1 are computed with scikit-learn against the integer labels in test.txt.";
  sciencelive:hasScopeDescription "This study reproduces the headline classification metrics reported by Decrop et al. 2025 for the EfficientNetV2-B0 phytoplankton classifier on the held-out test partition: top-1 accuracy, top-5 accuracy, and the macro/micro/weighted F1 scores. We use the authors' publicly released pretrained weights, the exact train/val/test split files distributed alongside the model, and the same 95-class taxonomy.";
  sciencelive:targetsClaim <https://w3id.org/sciencelive/np/RAQbvusYubgaYlU7YEPIgPmTwqDoylwiq5FKyrVgF95qM/phytoplankton-claim> .

<https://w3id.org/sciencelive/np/RA0MAQq87wUQcfDjOecXxXlBcJBCh9X5CcUO4YgefL_6o/earthcare-dggs>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "EarthCARE Level-2 to HEALPix DGGS conversion pipeline (MSI, ATLID, CPR)";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q117479905>,
    <http://www.wikidata.org/entity/Q1277576>, <http://www.wikidata.org/entity/Q199687>,
    <http://www.wikidata.org/entity/Q5629401>, <http://www.wikidata.org/entity/Q757520>;
  sciencelive:hasDeviationDescription "No prior implementation of this claim exists. This is the first end-to-end pipeline demonstrating EarthCARE Level-2 → HEALPix DGGS conversion on the WGS84 ellipsoid. We therefore classify the study as a Replication (new evidence produced) rather than a Reproduction (re-running prior code), following FORRT terminology.";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q1348989>;
  sciencelive:hasMethodologyDescription "Python package with 7 Jupyter notebooks: (1) data download via earthcarekit; (2) structural exploration of Level-2 products; (3) 2D Multi-Spectral Imager (MSI) swath → HEALPix cells with per-pixel nearest-cell assignment and type-aware aggregation (mode for classification variables, mean for continuous variables, quadrature-in-variance for uncertainty variables); (4) 1D Atmospheric Lidar (ATLID) and Cloud Profiling Radar (CPR) profiles → HEALPix cell identifier with preserved vertical axis; (5) DGGS-Zarr persistence for downstream analysis. WGS84 geodetic cell placement provided by healpix-geo (10.5281/zenodo.19337734); xarray DGGS integration via xdggs (10.5281/zenodo.14216728); reproducible Pixi environment. Released as v0.1.0 on GitHub and archived on Zenodo (10.5281/zenodo.19709327).";
  sciencelive:hasScopeDescription "Conversion of EarthCARE Level-2A products (MSI_AOT_2A, MSI_COP_2A, ATL_AER_2A, ATL_ALD_2A, ATL_CTH_2A, ATL_EBD_2A, ATL_ICE_2A, CPR_CLD_2A, CPR_FMR_2A) to a HEALPix Discrete Global Grid System representation on the WGS84 ellipsoid, covering both 2D swath imagery and 1D vertical atmospheric profiles.";
  sciencelive:targetsClaim <https://w3id.org/np/RAohMnYpM5g2aHmbJ89XYruyjMER-Jgh8Z7P9IFpsWHAE/earthcare-dggs> .

<https://w3id.org/sciencelive/np/RAW_JRkV23_hp26WaP0POGns07s3eJXzItiJEINUIcdes/cross-domain-few-shot>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Reproducing cross-domain few-shot benchmark on Sentinel-2 with matching architecture";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q110797734>,
    <http://www.wikidata.org/entity/Q1425625>, <http://www.wikidata.org/entity/Q197536>,
    <http://www.wikidata.org/entity/Q6822311>, <http://www.wikidata.org/entity/Q816747>;
  sciencelive:hasDeviationDescription "We reimplemented the ResNet-10 architecture from Guo et al.'s published code rather than running their original codebase (which requires Python 3.5 and PyTorch 0.4). Our implementation uses PyTorch 2.11. Minor differences in random seed and data loading may exist.";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q21198>;
  sciencelive:hasMethodologyDescription "We built a ResNet-10 backbone (4.9 million parameters) matching Guo et al.'s custom implementation: SimpleBlock residual blocks, custom weight initialization, 7×7 average pooling for 224×224 pixel input images. Training used 40,000 episodic steps on mini-ImageNet (38,400 photographs of everyday objects) with data augmentation (random resized crop, colour jitter, horizontal flip). Evaluation on EuroSAT (27,000 real Sentinel-2 satellite patches, 10 land cover types) over 200 random 5-way tasks with 5, 20, and 50 labeled examples.";
  sciencelive:hasScopeDescription "Reproducing the exact Prototypical Networks results from Guo et al. (2020) Table 1 on EuroSAT satellite imagery, matching their architecture, image resolution, and training procedure.";
  sciencelive:targetsClaim <https://w3id.org/np/RALaapCREr1eEa-abjVo3M7Ago23rN2ISt3gMjYXq2cKo/cross-domain-claim> .

<https://w3id.org/sciencelive/np/RAzP6xzTxbXWC9hJDkdp2M5gq4di4cVggPvxRPnBh9-1k/sst-reproduction-datras>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Reproduction of SST-fish community claim using ICES DATRAS";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q1507383>,
    <http://www.wikidata.org/entity/Q152>;
  sciencelive:hasDeviationDescription "Filters applied: shelf depth ≤200 m, fish species only (via WoRMS classification), Baltic surveys (BITS, SE-SOUND) excluded. This yields 258 species and 247 grid cells, versus Rutterford's 198 species and 193 grid cells, because of different filtering thresholds.";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q7150>;
  sciencelive:hasMethodologyDescription "Fish abundance data (catch per unit effort) from standardised bottom trawl surveys were matched to sea surface temperature (SST) and salinity from Bio-ORACLE environmental layers. Sampling locations were aggregated into 1x1 degree grid cells. Community structure was analysed using Principal Coordinates Analysis (PCoA) on Bray-Curtis dissimilarity between grid cells. The relative importance of SST, salinity, and depth as drivers of community structure was assessed by correlating each environmental variable with the main axes of community variation.";
  sciencelive:hasScopeDescription "The full claim is tested: whether SST is the primary environmental driver of fish community structure on the NE Atlantic continental shelf, using the same ICES DATRAS trawl survey data source as Rutterford et al.";
  sciencelive:targetsClaim <https://w3id.org/np/RA-DG_PBY8ddwmtYQzkFuejXePhrmJgUZyP65mupugNNg/sst-fish> .

<https://w3id.org/sciencelive/np/RAZXfd2G6MUaJKRnxBPnUGHk5ATg5NNFGkEDWE9YiU_fo/zkp-replication-walkthrough>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "ZKP compliance verification applied to biodiversity monitoring";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q1749732>,
    <http://www.wikidata.org/entity/Q191943>, <http://www.wikidata.org/entity/Q503021>,
    <http://www.wikidata.org/entity/Q625376>;
  sciencelive:hasDeviationDescription "Different domain (water quality monitoring vs carbon emissions), single-party proof (no supply chain), no EdDSA signature verification inside circuit. Tests the core claim that zk-SNARKs can verify environmental compliance without revealing private data.";
  sciencelive:hasMethodologyDescription "Circom 2.1.9 circuit with snarkjs 0.7.6 using the Groth16 protocol. Synthetic hydromet data modelled on LifeWatch ERIC Donana monitoring stations. Circuit enforces that each reading is non-negative and strictly below a public threshold. Proof generation and verification measured on commodity hardware.";
  sciencelive:hasScopeDescription "Applied zk-SNARK-based compliance verification to water quality monitoring at Donana National Park, a Natura 2000 site monitored by LifeWatch ERIC. Implemented a Groth16 circuit in Circom that proves 24 hourly conductivity readings are below the EU Water Framework Directive threshold without revealing individual values.";
  sciencelive:targetsClaim <https://w3id.org/np/RAb2CZzu-V3tA9Y1Bw6SWNbVLvqzREa-1R3z6BNoqTUT0/zkp-privacy-walkthrough> .

<https://w3id.org/sciencelive/np/RAxDONwg2TNpjjbveE8o4JoOSTlshLytU1PeUkPteZuYY/odrl-replication-walkthrough>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Replication of FAIR2Adapt ODRL access control on synthetic biodiversity dataset";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q141090>,
    <http://www.wikidata.org/entity/Q228502>, <http://www.wikidata.org/entity/Q29032648>,
    <http://www.wikidata.org/entity/Q3883056>, <http://www.wikidata.org/entity/Q57814310>;
  sciencelive:hasDeviationDescription "Uses synthetic biodiversity data instead of real sensitive research data. Consumer DID is a walkthrough identity (did:web:fair2adapt.github.io:fair-data-access:example-consumer) rather than a real researcher's DID. Access request is simulated locally rather than through the GitHub Actions automated workflow.";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q1149776>;
  sciencelive:hasMethodologyDescription """Using the fair-data-access Python framework:
(1) generate an AES-256-GCM dataset key and encrypt a synthetic biodiversity CSV,
(2) publish an ODRL Offer policy as a signed nanopub with a Public Benefit purpose constraint,
(3) simulate a consumer access request with a did:web identity,
(4) evaluate the ODRL policy against the declared purpose,
(5) wrap the dataset key using ECDH key agreement with the consumer's public key,
(6) unwrap and decrypt on the consumer side,
(7) verify data integrity by comparing decrypted output to the original.""";
  sciencelive:hasScopeDescription "End-to-end test: encrypt dataset, publish ODRL policy as nanopub, evaluate automated access request, wrap key for consumer DID, decrypt, verify integrity.";
  sciencelive:targetsClaim <https://w3id.org/np/RAjogJjeq6z2ny-eVPXxQWP4s2QAISBuoLsBifs4IWR_A/odrl-automated-access-control> .

sub1:kl-study-vjea9aobg7 a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Replication study: Determinants of Bactrocera oleae abundance in olive groves";
  sciencelive:hasLoomRecord <https://knowledgeloom.tib.eu/resource/vjea9aobg7>;
  sciencelive:hasMethodologyDescription "Analysis types: Regression analysis. Machine-readable descriptions generated using dtreg and published in the TIB Knowledge Loom.";
  sciencelive:hasScopeDescription "Reproduction of analyses from Knowledge Loom record: Determinants of Bactrocera oleae abundance in olive groves. In a study of 25 olive groves in the Beira Interior region of Portugal, landscape simplification (i.e. the share of the area surrounding the olive groves covered by other olive groves) and landscape diversity (i.e. landscape Shannon diversity index) displayed the most notable effects on Bactrocera oleae abundance in olive groves. The bivariate generalized linear models show that B. oleae abundance increases with decreasing landscape complexity and diversity.";
  sciencelive:targetsClaim <https://w3id.org/sciencelive/np/RACSKwvXe5B441qTr7AuTzmT74g03VRuERc-31egcScZI/RAd81Aae_Buc4blS-RseBVlFybfD32gxisppiho-INgWg/kl-claim-d9h0kie3p9>,
    <https://w3id.org/sciencelive/np/RACSKwvXe5B441qTr7AuTzmT74g03VRuERc-31egcScZI/RAgqt-UhDLp-Zl-1zBTrFyCOHj_bMQWnoaYRR3eTGdKGw/kl-claim-9p3urx5m79> .

<https://w3id.org/sciencelive/np/RARq1rfJJbMI_sjtBgSltkPIrYkOecD6UPkOiCh8GLSqo//kl-study-vjea9aobg7>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Replication study: Determinants of Bactrocera oleae abundance in olive groves";
  sciencelive:hasLoomRecord <https://knowledgeloom.tib.eu/resource/vjea9aobg7>;
  sciencelive:hasMethodologyDescription "Analysis types: Regression analysis. Machine-readable descriptions generated using dtreg and published in the TIB Knowledge Loom.";
  sciencelive:hasScopeDescription "Reproduction of analyses from Knowledge Loom record: Determinants of Bactrocera oleae abundance in olive groves. In a study of 25 olive groves in the Beira Interior region of Portugal, landscape simplification (i.e. the share of the area surrounding the olive groves covered by other olive groves) and landscape diversity (i.e. landscape Shannon diversity index) displayed the most notable effects on Bactrocera oleae abundance in olive groves. The bivariate generalized linear models show that B. oleae abundance increases with decreasing landscape complexity and diversity.";
  sciencelive:targetsClaim <https://w3id.org/np/RA5mDbNSNCyV0vWJJQbscM7lCCILGwqrVsL8XuMM3mbHc>,
    <https://w3id.org/np/RApFSeXYamR1ZJ_L5HGZs0asF9Zx4h5K-uqSk7X8gfRCA> .

<https://w3id.org/np/RADCSRkRrlaOzRZ-lkPh1dnvthkhqFz55eGr1wtpW03vk/dggs-replication-2026>
  a sciencelive:FORRT-Replication-Study, sciencelive:Reproduction-Study;
  rdfs:label "Reproduction and Replication of DGGS Benchmark";
  <http://www.w3.org/2004/02/skos/core#related> <http://www.wikidata.org/entity/Q117023379>,
    <http://www.wikidata.org/entity/Q117479905>, <http://www.wikidata.org/entity/Q121775330>,
    <http://www.wikidata.org/entity/Q1425625>, <http://www.wikidata.org/entity/Q816747>;
  sciencelive:hasDeviationDescription """1. SCALE: The original paper tested up to 500 vector layers; our default configuration tests [5, 10, 20, 50, 100] layers but supports scaling to 500.

2. RASTER GENERATION: The paper used the NLMpy mid-point displacement algorithm. Our implementation uses NLMpy when available, with a Gaussian-filter fallback.

3. RANDOM MISALIGNMENT: The paper mentions \"jittering the origin point by up to one pixel\" for raster alignment; this feature is not implemented in our reproduction.

4. ADDITIONAL COMPARISON: We added xdggs as an alternative DGGS implementation not present in the original study, extending the work from pure reproduction to include replication with different tools.

5. PRE-INDEXED SCENARIO: The paper's raster benchmark used pre-indexed data in Apache Parquet queried with Polars. Our benchmark includes both on-the-fly indexing and pre-indexed scenarios to enable direct comparison.""";
  sciencelive:hasDiscipline <http://www.wikidata.org/entity/Q8008>;
  sciencelive:hasMethodologyDescription """REPRODUCTION METHODOLOGY:
- Vector benchmark: Implemented H3 polyfilling algorithm via h3-py library to convert Voronoi polygons to H3 cells at resolution 14, matching the paper's approach
- Raster benchmark: Used H3 Python loop (h3.latlng_to_cell) to index raster pixels to H3 cells, replicating the paper's indexing method
- Classification: Implemented all 7 number-theoretic classification functions (prime, perfect, triangular, square, pentagonal, hexagonal, Fibonacci) as described in the paper
- Data generation: Created synthetic Voronoi polygons and NLM raster landscapes following the paper's specifications

REPLICATION METHODOLOGY:
- Raster benchmark: Replaced H3 Python loop with xdggs library (xdggs.H3Info.geographic2cell_ids) for vectorized coordinate-to-cell conversion
- This tests whether alternative DGGS implementations affect the benchmark conclusions

COMPUTATIONAL ENVIRONMENT:
- Python 3.11 with h3 4.x, xdggs, NumPy, GeoPandas, Polars
- Docker container for reproducibility
- Benchmarks run on standardized hardware with multiple iterations""";
  sciencelive:hasScopeDescription """This study aims to reproduce and replicate the computational benchmark experiments from Law & Ardo (2024) \"Using a discrete global grid system for a scalable, interoperable, and reproducible system of landuse mapping\" (DOI: 10.1080/20964471.2024.2429847).

Specifically:
1. VECTOR BENCHMARK (Figure 6): Reproduces the comparison between traditional vector overlay operations and DGGS-based methods using H3 polyfilling, testing scalability across 5-500 input layers.

2. RASTER BENCHMARK (Figure 7):
   - REPRODUCTION: Recreates the paper's comparison using H3 Python bindings for coordinate-to-cell conversion
   - REPLICATION: Implements an alternative approach using xdggs for vectorized H3 indexing

The study aims to validate the paper's claims that (1) DGGS provides orders of magnitude performance improvement for vector operations, and (2) DGGS and raster methods show roughly equivalent performance for raster operations when using pre-indexed data.""";
  sciencelive:targetsClaim <https://w3id.org/np/RARXnKVStRazNmNbVB8YWn8iq0ctvZc8dl_5gtRSNXxsk/aida_dggs_interoperability> .
