Machine Learning  ·  NYC Airbnb  ·  Sean Jayasekera

The Superhost Signal

What machine learning reveals about who earns Airbnb's most coveted badge — and why it has almost nothing to do with your property

36,261
Listings Analyzed
0.962
Best AUC-ROC
18.9%
Are Superhosts

The Question

Airbnb's most powerful status signal is a badge most guests never think about — and most hosts don't fully understand.

If you've ever booked an Airbnb, you've probably seen the Superhost badge. A small orange label, easy to overlook. But in the data, it shows up everywhere — in booking rates, in search rankings, in the gap between a listing that gets chosen and one that doesn't. Superhosts don't just feel like better hosts. They are statistically, measurably, significantly more likely to get booked.

So what actually makes a Superhost? The official answer from Airbnb involves response rate, review scores, cancellation rate, and a minimum number of stays per year. But those are the criteria — not the underlying behavioral patterns that produce them. The real question is: if you trained a machine learning model on thousands of listings without telling it anything about these rules, what would it learn?

That's what this project set out to answer. Using data from Inside Airbnb's December 2025 snapshot of New York City — 36,261 active listings, 85 raw features — we built a full machine learning pipeline to predict Superhost status. The findings are more interesting than we expected.

"The features that most strongly predict Superhost status are not the ones you'd put in a real estate listing. They're the ones that reflect how seriously you take being a host."

36,261
Active listings
analyzed
81%
Regular hosts
(class imbalance)
2
Supervised models
trained & compared


Finding 01

Manhattan has the most listings. It also has the fewest Superhosts.

New York's five boroughs are, in many ways, five different Airbnb markets. Manhattan dominates in sheer listing volume — more than 16,000 active listings, roughly 45% of the entire city dataset. Brooklyn follows with just over 13,000. Staten Island, by contrast, has fewer than 1,000.

But when you look at Superhost rates rather than raw listing counts, the ranking inverts completely. Staten Island, the borough with the fewest listings, has the highest Superhost rate in the city — 32.1%. Manhattan, the borough with the most listings, has the lowest: just 15.2%.

This is not a statistical quirk. It reflects something fundamental about the difference between these hosting populations. Manhattan is dominated by professional property managers — people running multiple listings as a business. They optimize for volume and availability, not for the kind of personal engagement that earns consistent 5-star reviews and the response rates Airbnb rewards. Outer boroughs, by contrast, tend to be individual homeowners renting their actual homes. They invest personally in the experience.

Our model confirmed this intuition. Host listing count — the number of listings associated with a given host — turned out to be the single most predictive feature in the entire dataset. And its direction is counterintuitive: hosts with fewer total listings are significantly more likely to be Superhosts.

Superhost Rate by NYC Borough
Percentage of listings with Superhost status — overall average 18.9%

The gap between Staten Island (32.1%) and Manhattan (15.2%) is 17 percentage points. In a dataset of 36,000 listings, that's not noise — it's signal. It tells you that where a listing is located shapes the type of host who runs it, and the type of host who runs it shapes their chances of achieving Superhost status.

"The most listing-dense borough in New York City produces the fewest Superhosts. Scale and excellence, it turns out, do not always travel together."



Finding 02

NYC's hosting landscape isn't a spectrum. It's four distinct tribes.

Before building a predictive model, we asked a more open-ended question: does the data naturally group listings into identifiable types? Using K-Means clustering — an unsupervised technique that finds structure without being told what to look for — we let the algorithm divide 14,112 fully observed listings into clusters based purely on behavioral and quality features.

We tried k values from 2 to 9, evaluating each with both the elbow method (which tracks within-cluster inertia) and the silhouette score (which measures how well separated the clusters are). The mathematical optimum was k=2, but a two-cluster solution is too coarse to be interpretively useful. We chose k=4, which is supported by an inflection point in the elbow plot and produces four meaningfully different groups.
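The k-selection procedure above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the project's actual pipeline: the feature matrix is a stand-in, and only the k range (2 to 9) and the two evaluation criteria come from the analysis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled behavioral/quality feature matrix
X, _ = make_blobs(n_samples=2000, centers=4, random_state=42)
X = StandardScaler().fit_transform(X)

inertias, silhouettes = {}, {}
for k in range(2, 10):  # try k = 2 .. 9, as in the analysis
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = km.inertia_                         # elbow-method input
    silhouettes[k] = silhouette_score(X, km.labels_)  # cluster separation

# A purely mathematical pick maximizes the silhouette score
best_k = max(silhouettes, key=silhouettes.get)
```

In the real analysis the final choice weighed the elbow plot's inflection and interpretability against the silhouette argmax, rather than taking the mathematical optimum blindly.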

The four clusters that emerged map almost perfectly onto intuitive hosting archetypes — and their Superhost rates range from 2.3% all the way to 64.7%.

2.3%
Superhost rate
The Disengaged Host
Cluster 0  ·  n = 1,022

Low response rates and low acceptance rates, despite having adequate review scores. These hosts are likely semi-inactive — listed on the platform but rarely responsive. The near-zero Superhost rate confirms that behavioral engagement matters as much as listing quality.

35.1%
Superhost rate
The Typical Active Host
Cluster 1  ·  n = 7,858 (largest)

The backbone of NYC's Airbnb market. These hosts have solid review scores and genuine engagement, but haven't achieved the consistency required across every dimension for Superhost status. Many in this cluster are close — making them the highest-priority targets for platform support.

5.6%
Superhost rate
The Struggling Host
Cluster 2  ·  n = 576

High response and acceptance rates, but below-average review scores across most dimensions. These hosts are engaged and willing, but guests are consistently disappointed. The issue is likely property-level — misleading listings, cleanliness gaps, or amenity shortfalls — rather than availability or responsiveness.

64.7%
Superhost rate
The Elite Host
Cluster 3  ·  n = 4,656

High occupancy, frequent recent reviews, near-perfect scores, and exceptional responsiveness. Nearly two-thirds are Superhosts. The remainder likely fall just short on one criterion. This is the aspirational cluster — and studying what separates it from Cluster 1 is the most actionable insight in the dataset.

What makes this finding powerful is what the algorithm didn't know. It had no access to the Superhost label during clustering. It found these four groups purely from behavioral patterns — and yet they stratify perfectly by Superhost rate. That's not a coincidence. It means Superhost status reflects real, measurable behavioral differentiation, not arbitrary platform decisions.

Key insight: The 62.4 percentage-point gap in Superhost rates between the Disengaged Host (2.3%) and Elite Host (64.7%) clusters was discovered entirely without supervised labels. The algorithm found the structure. The Superhost label simply confirmed it.


Finding 03

The strongest predictors of Superhost status are things you can actually change.

Our Random Forest model — after systematic hyperparameter tuning across 60 cross-validated fits — learned to predict Superhost status with an AUC-ROC of 0.962. One of the most valuable outputs of a Random Forest is its feature importance ranking: a measure of which variables the ensemble of 200 decision trees found most useful for making correct predictions.

The top features reveal a clear pattern. The eight strongest predictors are all behavioral or activity metrics — not a single property characteristic makes the list. Your number of bedrooms, your room type, your neighborhood — none of it shows up in the features that matter most.

What does show up: how recently guests have reviewed your listing, how often it's actually occupied, how consistently you accept bookings, and how responsively you communicate. These are things a host can act on. They're not fixed attributes of a property — they're choices about how seriously you take the hosting role.
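Extracting a ranking like this is straightforward once the forest is trained. The sketch below uses synthetic data with illustrative column names; only the model settings (200 trees, max_features='sqrt') come from the project. The label is deliberately built from the behavioral columns only, so the property feature should land at the bottom of the ranking.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 5000
# Illustrative stand-ins for a few of the dataset's columns
X = np.column_stack([
    rng.poisson(3, n),      # host_listings_count
    rng.uniform(0, 1, n),   # estimated_occupancy
    rng.poisson(8, n),      # reviews_last_12m
    rng.integers(1, 5, n),  # bedrooms (property feature)
])
names = ["host_listings_count", "estimated_occupancy",
         "reviews_last_12m", "bedrooms"]

# Synthetic label driven only by the behavioral columns
signal = -0.5 * X[:, 0] + 3.0 * X[:, 1] + 0.3 * X[:, 2]
y = (signal + rng.normal(0, 1, n) > signal.mean()).astype(int)

rf = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                            random_state=42).fit(X, y)

# Gini importances sum to 1; sort descending to get the ranking
ranking = sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1])
```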

Top 8 Predictors of Superhost Status
Gini importance from tuned Random Forest (200 trees, max_features='sqrt')

The top-ranked feature, Host Listing Count (importance: 0.121), tells an interesting story. A higher listing count correlates with a lower Superhost probability. This is the professional property manager effect — scale dilutes the personal engagement that earns elite status. The hosts with the highest Superhost rates tend to have just one or two listings that they treat with genuine care.

Reviews in the last 12 months ranks third (0.108). Not total reviews — recent reviews. This makes sense because Airbnb evaluates Superhost status quarterly. Historical track record matters less than what you've been doing lately. A host who was excellent three years ago and has since become inactive will not hold Superhost status today.

"Your review scores ranked 8th. Your overall rating ranked lower still. The algorithm found that how frequently and recently guests reviewed you matters more than what they actually said."

The practical implication for any host is clear: Superhost status is not primarily about having a beautiful property. It's about maintaining consistent, recent, active engagement with the platform. Response rate, acceptance rate, and booking frequency — these are the levers.



Finding 04

Two models, one winner — and a precision-recall tradeoff worth understanding.

We trained two classifiers on the same 80/20 stratified split of 36,261 listings. The Random Forest was systematically tuned using RandomizedSearchCV — 20 candidate hyperparameter combinations evaluated under 3-fold cross-validation, with AUC-ROC as the scoring target. The Neural Network used a funnel-shaped 256→128→64 architecture with dropout regularization and computed class weights to handle the 81/19 class imbalance.
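The tuning setup just described — 20 sampled candidates under 3-fold cross-validation, scored on AUC-ROC, for 60 fits total — looks roughly like the sketch below. The search space and synthetic data are assumptions; only the shape of the search comes from the text.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in mimicking the 81/19 class imbalance
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.81, 0.19], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)  # 80/20 stratified

param_dist = {                      # assumed search space
    "n_estimators": randint(50, 150),
    "max_depth": randint(5, 30),
    "min_samples_leaf": randint(1, 10),
    "max_features": ["sqrt", "log2"],
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=20, cv=3,                # 20 candidates x 3 folds = 60 fits
    scoring="roc_auc",              # tuned for AUC-ROC, as in the project
    random_state=42, n_jobs=-1,
).fit(X_tr, y_tr)
```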

The headline result is that the tuned Random Forest outperforms the Neural Network on every metric except recall. This is consistent with a well-established finding in applied machine learning: on structured tabular data with mixed feature types, tree-based ensembles typically outperform neural networks. Neural networks excel on unstructured data — images, text, audio — where they can learn hierarchical representations. Here, the features are already meaningful and interpretable, and the Random Forest extracts that signal more efficiently.

Metric                   Random Forest (Tuned)   Neural Network
AUC-ROC                  0.9616 ✓                0.9379
Accuracy                 0.9149 ✓                0.8385
Precision (Superhost)    0.8300 ✓                0.5441
Recall (Superhost)       0.6800                  0.9031 ✓
F1-Score (Superhost)     0.7500 ✓                0.6791

The precision-recall tradeoff between the two models is worth understanding in practical terms. The Random Forest's precision of 0.83 means that when it predicts Superhost, it's correct 83% of the time — a high-confidence prediction. Its recall of 0.68 means it misses about 32% of actual Superhosts, being conservative about who it flags.

The Neural Network flips this: 90% recall (it catches nearly all actual Superhosts) but only 54% precision (nearly half of its Superhost predictions are wrong). The Neural Network casts a wide net, accepting many false positives in exchange for fewer misses.

Which is better depends entirely on how you use the model. If you're building a tool to allocate coaching resources to high-potential hosts, you want high precision — you don't want to waste resources on false positives. That's the Random Forest. If you're building an early-detection system where missing a potential Superhost is costly, the Neural Network's high recall is preferable. Both have legitimate uses. They're just different tools for different problems.
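The tradeoff is not even fixed per model: a single probabilistic classifier can be slid along the precision-recall curve by changing its decision threshold. A minimal sketch, using a synthetic imbalanced dataset and a logistic regression as a stand-in for either classifier:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same 81/19 imbalance
X, y = make_classification(n_samples=4000, weights=[0.81, 0.19],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]  # predicted P(Superhost) per listing

results = {}
for threshold in (0.3, 0.5, 0.7):
    pred = (proba >= threshold).astype(int)
    results[threshold] = (precision_score(y_te, pred),
                          recall_score(y_te, pred))

# Lowering the threshold trades precision for recall (wide net, NN-style);
# raising it trades recall for precision (conservative, RF-style).
```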

One additional finding worth noting: the Neural Network showed signs of overfitting during training, with training loss continuing to decrease while validation loss plateaued around epoch 20. This is a real limitation — more aggressive regularization or a different architecture would likely improve generalization on held-out data.


Summary

Six things the data tells us about what it really takes to become a Superhost.

01  —  Borough Paradox

Scale works against you

Manhattan has the most listings and the lowest Superhost rate. Staten Island has the fewest listings and the highest. Professional property management and elite hosting tend not to coexist.

02  —  Market Segments

Four tribes, not a spectrum

Unsupervised clustering revealed four host archetypes with Superhost rates from 2.3% to 64.7% — discovered entirely without the Superhost label. The structure is real and the algorithm found it.

03  —  Feature Importance

Behavior beats property

The top 8 predictors of Superhost status are all behavioral and activity metrics. Room type, bedroom count, and neighborhood do not appear. What you do matters far more than what you have.

04  —  Recency Matters

What you did lately counts most

Reviews in the last 12 months outrank total review history as a predictor. Airbnb evaluates Superhost quarterly — consistent recent performance is more predictive than a long historical track record.

05  —  Model Choice

Random Forest wins on tabular data

The tuned Random Forest (AUC=0.962) outperformed the Neural Network (AUC=0.938) on all metrics except recall — consistent with ensemble methods' known advantage on structured tabular features.

06  —  Precision vs Recall

The tradeoff is the decision

High precision (RF) vs high recall (NN) is not a question of which model is better — it's a question of what the cost of a false positive is versus a false negative in your specific use case.

"A model that predicts Superhost status with an AUC of 0.962 is useful. But the feature importance ranking is what actually tells you something actionable about the world."

Conclusion

The badge is a signal. What the data tells you is what kind of signal it actually is.

The Superhost badge is often treated as a proxy for property quality — a sign that the listing itself is better. What this analysis reveals is that it's primarily a proxy for host behavior. The hosts who earn it tend to be engaged, responsive, and active. The ones who don't tend to be absent, selective, or disengaged — regardless of how nice their apartment is.

This matters for both sides of the market. For Airbnb as a platform, the clearest use of a model like this is early identification — finding hosts in Cluster 1 (the Typical Active Host, with a 35.1% Superhost rate) who are close to the threshold and could benefit from targeted coaching or nudges around acceptance rate and response time. These are the hosts most likely to respond, and the ones where small behavioral changes would make the biggest difference.

For individual hosts, the finding is even more direct: if you want Superhost status, work on your acceptance rate and response rate before you work on your decor. Keep your listing active and get recent bookings. Reviews in the last year matter more than your lifetime total. These are behaviors, not investments — and they're entirely within reach.

One honest caveat: this is observational data. The model predicts Superhost status from behavioral patterns — it cannot tell us that changing those behaviors will cause Superhost status to follow. There may be confounding factors we haven't measured. A host who artificially inflates their acceptance rate by accepting bookings they'll later cancel will not improve their standing. The relationship is real, but it runs through genuine quality, not through gaming metrics.

"Machine learning doesn't explain the world. It makes the patterns in it legible — and occasionally, those patterns turn out to be more interesting than anyone expected."