Research Methodology

SentiSense Methodology

Version

v1.1

Effective

June 19, 2026

Next Review

December 2026

Published by

Compass AI Data Services, LLC

This page describes how SentiSense collects data, processes it, and publishes research. It is written for practitioners: traders, engineers, and AI agents integrating our API. It is not a legal agreement and does not modify the Terms of Service or AI Disclaimer.

We publish it because opacity in financial data products is a red flag. We would rather give our users the tools to be skeptical.

1. Coverage

SentiSense covers publicly listed US equities, with emphasis on S&P 500 constituents, and is actively expanding. We track companies, institutional investors (13F filers), corporate insiders (Section 16 reporters), and US political officeholders subject to STOCK Act disclosure, with derived signals at the sector and index level.

Out of scope today: options pricing, private company valuations, cryptocurrencies, FX, commodities, and fixed income.

2. Data sources

We ingest from public financial press, SEC EDGAR filings, government disclosure portals, company investor relations pages, and curated social sources (selected public communities and partner channels with consent). We license market data and fundamentals from commercial providers.

We deliberately do not ingest paywalled content we are not licensed to redistribute, leaked material non-public information, or anonymous tip channels.

3. How we use AI

SentiSense uses AI across the pipeline. Precisely:

AI handles sentiment classification, entity resolution, news clustering, structured extraction from unstructured text, and the drafting of summaries and research reports from structured inputs.

AI does not generate investment recommendations, set price targets, take directional positions on covered companies, fabricate data, or rewrite primary source quotations.

Our research is built to present multiple perspectives. When a company reports earnings, we cover the bull case and the bear case. When institutional flows shift, we describe what that implies and what it does not. The system is designed to help readers think, not to think for them. Each report links its factual claims to source documents. If we cannot cite it, we do not claim it.

4. Scoring

Sentiment scores are real-valued in the range -1 to +1. Aggregated sentiment is a reliability-weighted mean of constituent document scores. Sources are tiered by reliability, and the tier is exposed on per-document API responses.

A single high-confidence document can constitute a signal. A cluster of low-reliability documents typically does not, regardless of volume. We do not equate loud with significant.
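As a rough illustration of a reliability-weighted mean, the sketch below aggregates per-document scores using a weight per reliability tier. The tier names, the weight values, and the `DocumentScore` and `aggregate_sentiment` names are assumptions made for this example; the actual tiers and weights are not specified here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tier weights, chosen only for illustration; the real
# tiers and weights are not published in this section.
TIER_WEIGHTS = {"high": 1.0, "medium": 0.5, "low": 0.1}


@dataclass(frozen=True)
class DocumentScore:
    score: float  # sentiment in [-1.0, +1.0]
    tier: str     # reliability tier, a key of TIER_WEIGHTS


def aggregate_sentiment(docs: list[DocumentScore]) -> Optional[float]:
    """Return the reliability-weighted mean score, or None if there is no weight."""
    total_weight = sum(TIER_WEIGHTS[d.tier] for d in docs)
    if total_weight == 0:
        return None
    weighted_sum = sum(TIER_WEIGHTS[d.tier] * d.score for d in docs)
    return weighted_sum / total_weight


# One high-reliability bearish document against five low-reliability bullish ones.
docs = [DocumentScore(-0.8, "high")] + [DocumentScore(0.6, "low")] * 5
aggregate = aggregate_sentiment(docs)
```

With these assumed weights the aggregate stays negative even though bullish documents outnumber the bearish one five to one, which is the "loud is not significant" behavior described above.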

5. Institution rankings: concentration and AUM

Inside the Trackers tab, SentiSense publishes two institution-ranking trackers built from SEC 13F filings: a top-10 concentration ranking and a largest-by-AUM ranking. This section explains what they measure and where the data is thin.

What gets ranked

The ranked universe is every institutional investor that files Form 13F with the SEC. Filing is mandatory for institutional investment managers with at least $100 million in US equity assets under management, and each filing is due within 45 days after quarter end. Filers include Berkshire Hathaway, Vanguard, BlackRock, Citadel, university endowments, family offices, and roughly eight thousand others.

Each 13F discloses long positions in US-listed equities held at the quarter-end snapshot. That is the entire input. The 13F does not contain short positions, options, fixed income, commodities, FX, private holdings, leverage, cash balances, or fees. Both rankings are computed strictly on the long US-equity book a filer chose to disclose, so read them as a view of disclosed positioning, not of total fund size or performance.

What the two rankings measure

Concentration ranks filers by the share of their disclosed book held in their top 10 positions: a proxy for conviction. A high share means a few names carry the portfolio; a low share means it is spread thin.
AUM ranks filers by the total market value of their disclosed long US-equity book at the snapshot. This is disclosed-equity AUM, not a firm's total assets across all strategies and asset classes.
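The two measures can be sketched in a few lines. This is a minimal example under stated assumptions: the filer names and position values are made up, and `disclosed_aum` and `top10_share` are hypothetical helper names, not part of any SentiSense API.

```python
from typing import Optional


def disclosed_aum(position_values: list[float]) -> float:
    """Total market value of the disclosed long US-equity positions."""
    return sum(position_values)


def top10_share(position_values: list[float], top_n: int = 10) -> Optional[float]:
    """Share of the disclosed book held in the top_n largest positions."""
    total = disclosed_aum(position_values)
    if total <= 0:
        return None  # nothing priced, so no meaningful share
    largest = sorted(position_values, reverse=True)[:top_n]
    return sum(largest) / total


# Hypothetical filers with made-up position values (in $ millions).
books = {
    "Filer A": [50, 20, 10, 5, 5, 2, 2, 2, 1, 1, 1, 1],  # 12 positions, total 100
    "Filer B": [10] * 20,                                  # 20 equal positions, total 200
}

by_concentration = sorted(books, key=lambda name: top10_share(books[name]), reverse=True)
by_aum = sorted(books, key=lambda name: disclosed_aum(books[name]), reverse=True)
```

In this toy data Filer A leads on concentration (98% of its book in ten names) while Filer B leads on disclosed-equity AUM, showing that the two rankings answer different questions.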

Coverage

Both rankings need a price for each holding at the quarter end. For a filer whose book prices cleanly, the ranking figure is a measurement; for one with thin pricing coverage (non-listed securities, foreign listings outside our US-equity filter, or names delisted between filing and our run), it should be read as a sample. We compute on the disclosed long US-equity book and nothing else.
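One way to picture pricing coverage is as the fraction of a filer's disclosed positions for which a quarter-end price was found. The sketch below assumes coverage is counted per position; whether the production figure is count-based or value-based is not stated here, and the tickers, function names, and numbers are hypothetical.

```python
from typing import Optional

# (ticker, shares, quarter-end price or None when no price was found)
Holding = tuple[str, int, Optional[float]]


def pricing_coverage(holdings: list[Holding]) -> float:
    """Fraction of disclosed positions, by count, that have a quarter-end price."""
    if not holdings:
        return 0.0
    priced = sum(1 for _, _, price in holdings if price is not None)
    return priced / len(holdings)


def priced_value(holdings: list[Holding]) -> float:
    """Market value of the positions that could be priced; unpriced rows are skipped."""
    return sum(shares * price for _, shares, price in holdings if price is not None)


# Made-up book: one of four positions has no usable price.
holdings = [
    ("AAA", 100, 10.0),
    ("BBB", 50, 20.0),
    ("CCC", 200, None),
    ("DDD", 10, 5.0),
]

coverage = pricing_coverage(holdings)
value = priced_value(holdings)
```

A low coverage value signals that the computed concentration or AUM rests on only part of the disclosed book.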

Update cadence

New quarters are computed once the quarter has settled and the relevant 13Fs have been filed, which happens over the 45 days after quarter end; filings that arrive later refine the figure in place. Historical quarters are immutable unless we find a data-quality issue, in which case we re-run and note it in the change log.

6. Hedge fund reported returns: net-of-fee numbers funds publish themselves

The Hedge fund reported returns tracker records the net-of-fee returns large hedge funds state publicly, in the press and in their own investor communications. Rather than infer performance from disclosed 13F positions, every cell is a number a fund actually reported, with a citation you can open and read.

Where the numbers come from

These are not figures we compute. They are net-of-fee returns that funds, or reporters citing fund letters, have published: full-year annual returns and, where available, the latest year-to-date or interim figure. Each cell carries the primary source as a link and a short quote, so you can verify the number against its origin rather than trusting an aggregate. Coverage at launch spans roughly two dozen of the most-watched funds, with several years of annual history each.

How to read a cell

The headline column is each fund's latest reported annual return. Because funds report on their own schedule, the year that number refers to varies by fund, so the year is shown next to each value rather than in the column header: a fund that has reported 2025 shows 2025, one that has only reported through 2024 shows 2024. A blank cell means we have not found a citable public number for that fund and period, not that the return was zero. Some funds are deliberately opaque and may show only a single year.

The two alpha columns

Alongside the reported return, each fund carries two context columns. Alpha vs SPY is simply the fund's reported net return for a year minus the S&P 500 total return (dividends reinvested) for that same calendar year, so you can see the figure relative to just owning the index. 13F long-book alpha is a different, independent lens: the trailing one-year alpha of the firm's disclosed 13F long-equity book versus SPY, computed from its quarterly filings. It is deliberately one leg of the book, the long US-listed stocks only, not the fund's shorts, derivatives, fixed income, or private positions. For a market-neutral or macro fund the two columns can diverge sharply, and that gap is the point: it hints at how much of the story the long equity book alone tells. Read the 13F column as a cross-check, not as the fund's actual return.
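The Alpha vs SPY arithmetic is a plain difference in percentage points. The sketch below uses invented return figures and a hypothetical `alpha_vs_spy` helper; it also assumes that a blank reported return propagates as a blank alpha rather than being treated as zero.

```python
from typing import Optional


def alpha_vs_spy(
    fund_net_return_pct: Optional[float],
    index_total_return_pct: float,
) -> Optional[float]:
    """Reported net return minus the index total return for the same calendar year.

    Both inputs are in percent; the result is in percentage points.
    A missing reported return (None) yields None, not zero.
    """
    if fund_net_return_pct is None:
        return None
    return fund_net_return_pct - index_total_return_pct


# Invented figures for illustration only.
outperformer = alpha_vs_spy(15.0, 10.0)    # fund +15.0%, index +10.0%
underperformer = alpha_vs_spy(-2.0, 10.0)  # fund -2.0%, index +10.0%
no_report = alpha_vs_spy(None, 10.0)       # no citable number for that year
```

Because each fund's latest reported year can differ, the index return passed in has to be the one for that fund's reported year, not a single shared value.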

What this tracker is, and is not

It is a faithful record of what funds have said publicly, net of fees. It is not an audited, standardized, or independently verified performance series: funds choose what to disclose and when, headline numbers can reflect a flagship share class rather than every vehicle, and selective reporting means strong years are more likely to be publicized than weak ones. Read it as sourced public claims, useful precisely because each one is citable, not as a like-for-like benchmark across managers.

Update cadence

New annual numbers are added as funds report them, which clusters in the first quarter of each year. We update existing cells in place if a fund restates or a better primary source appears, and we keep the original citation on every number so revisions are traceable.

7. What Reddit is buying

This tracker reads public posts across a fixed allowlist of finance subreddits and builds a standing portfolio of the tradeable stocks the crowd has turned bullish on. A name enters when it is among the most-mentioned over a trailing window and sentiment leans bullish; its entry date and price are then frozen, and we score its return since entry against the S&P 500. The headline compares the equal-weighted portfolio to the index over the same windows, refreshed roughly monthly.

Read it as inferred positioning, not stated trades: mention volume and tone show what the crowd is talking about and how it leans, not what anyone actually bought. It is survivorship-free (entries stay, so weak picks are not dropped to flatter the average), skews toward large, liquid, heavily discussed names, and reflects the market regime. We exclude untradeable names and broad-market index funds, and omit any row we cannot price rather than show a fabricated return. Not advice.
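The return scoring can be sketched as follows. This is an illustration under assumptions: the `Entry` fields, the use of index levels rather than a total-return series, and the tickers and prices are all hypothetical. Each entry is compared to the index over its own window (from its frozen entry date to now), rows without a current price are omitted, and the portfolio is equal-weighted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    ticker: str
    entry_price: float              # frozen at entry
    current_price: Optional[float]  # None when the row cannot be priced
    index_at_entry: float           # index level on the entry date
    index_now: float                # index level at the refresh


def simple_return(start: float, end: float) -> float:
    return end / start - 1.0


def portfolio_vs_index(entries: list[Entry]) -> Optional[tuple[float, float]]:
    """Equal-weighted mean return of priced entries and of the index over the same windows."""
    priced = [e for e in entries if e.current_price is not None]
    if not priced:
        return None
    portfolio = sum(simple_return(e.entry_price, e.current_price) for e in priced) / len(priced)
    index = sum(simple_return(e.index_at_entry, e.index_now) for e in priced) / len(priced)
    return portfolio, index


# Made-up entries: a winner, a loser that stays in the portfolio, and an unpriced row.
entries = [
    Entry("AAA", 100.0, 120.0, 4000.0, 4400.0),
    Entry("BBB", 50.0, 40.0, 5000.0, 5000.0),
    Entry("CCC", 10.0, None, 5000.0, 5100.0),
]

portfolio_return, index_return = portfolio_vs_index(entries)
```

Keeping the losing entry in the average is what makes the comparison survivorship-free; dropping it would lift the portfolio figure in this toy data from 0% to 20%.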

8. Limitations

Small-cap and non-US coverage is thinner than large-cap US.
Our models are trained predominantly on English-language content.
Known model weaknesses include sarcasm, ironic framing, conservative forward guidance, and corporate euphemism.
For time-sensitive signals, end-to-end latency from publication is constrained. Users building low-latency strategies should design for this bound.

Sentiment is not causation. A bearish shift describes the tone of available information; it does not predict price. We do not publish price targets or trade recommendations.

9. Governance

This methodology is reviewed semi-annually and on any material change to our data sources, models, or scoring. Changes are recorded in the change log. Prior versions remain accessible.

10. Disclosures

No personalized advice. All published signals, reports, and scores are informational and educational. See the AI Disclaimer.
No compensation from covered entities. We do not accept payment in exchange for coverage, rating, or placement.
Team members may trade. Members of our team actively participate in the markets and use SentiSense as a research tool. Published research is generated by AI systems built for balanced analysis and does not reflect any team member's personal positions.
Data licensing. Commercial licensing relationships do not influence the content or scoring of our published research.

11. Change log

Version	Date	Change
v1.0	June 2026	Initial publication: sentiment methodology, the institution rankings, hedge fund reported returns, and the Reddit positioning tracker.

Questions, corrections, feedback: support@sentisense.ai