A New Perspective on the US-China AI Race: 2025 Ollama Deployment Comparison and Global AI Model Trend Insights

1. Introduction

Ollama is a prominent open-source toolkit engineered to streamline the local execution, creation, and distribution of Large Language Models (LLMs). It encapsulates model weights, configurations, and associated data within a Modelfile-defined package, exposing an API for programmatic interaction with these models. This architecture empowers developers and researchers to efficiently deploy and experiment with a diverse range of advanced AI models on local workstations or server infrastructure.
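That API is plain HTTP returning JSON. As a minimal illustration of the manifest format this report harvests, the snippet below parses a fabricated /api/tags-style payload — the field names follow the Ollama API, but the model entries themselves are invented examples:

```python
import json

# A fabricated example of the JSON shape returned by GET /api/tags:
# a "models" array whose entries carry a tag name plus metadata such
# as family, parameter_size, and quantization_level.
SAMPLE_PAYLOAD = """
{"models": [
  {"name": "llama3:latest",
   "details": {"family": "llama", "parameter_size": "8.0B",
               "quantization_level": "Q4_0"}},
  {"name": "nomic-embed-text:latest",
   "details": {"family": "nomic-bert", "parameter_size": "137M",
               "quantization_level": "F16"}}
]}
"""

def model_names(payload: str) -> list:
    """Extract the deployed model tags from an /api/tags payload."""
    return [m["name"] for m in json.loads(payload).get("models", [])]

print(model_names(SAMPLE_PAYLOAD))
# → ['llama3:latest', 'nomic-embed-text:latest']
```

Against a live local instance, the same payload shape is obtained with `GET http://localhost:11434/api/tags`.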

This report endeavors to elucidate deployment metrics, model selection patterns, geo-distribution, and network infrastructure characteristics by analyzing telemetry from 174,590 globally deployed Ollama instances.

Note: Data statistics presented in Section 5 and Section 7 are derived from the full dataset of 174,590 instances. Data in Section 6 is sourced from instances with accessible APIs. For security operational reasons (SecOps), Ollama version distribution statistics have been omitted.

  • Data Snapshot Date: 2025-04-24

  • Report Origin: Tenthe AI (https://tenthe.com)

  • Author: Ryan

2. Executive Summary

This report synthesizes findings from network scan data and API endpoint interrogation of publicly accessible Ollama instances worldwide. Key findings include:

  • Globally, from an initial corpus of approximately 174,590 records identified via Fofa (99,412 unique IP addresses), 41,021 Ollama instances with reachable APIs were successfully probed, distributed across 24,038 unique IP addresses (an effective API accessibility rate of approximately 24.18%).

  • Geographically, the United States and China exhibit the highest density of Ollama deployments. Cloud Service Providers (CSPs), notably AWS, Alibaba Cloud, and Tencent Cloud, constitute the primary hosting infrastructure for Ollama instances.

  • Model deployment analysis reveals diversity, with llama3, deepseek-r1, mistral, and qwen series models demonstrating widespread adoption. Among these, llama3:latest and deepseek-r1:latest are the two most frequently deployed model tags.

  • Models with 7B-8B parameters exhibit the highest user adoption, while 4-bit quantized schemes, such as Q4_K_M and Q4_0, are extensively utilized for their favorable trade-off between performance and resource consumption.

  • The default port 11434 is predominantly utilized, and a majority of instances expose services via the HTTP protocol.

3. Data Sources and Methodology

The dataset for this report was compiled through a two-stage process:

  1. Initial Discovery Scan: Utilization of public network scanning platforms (e.g., Fofa) with the query app="Ollama" && is_domain=false for initial discovery of candidate Ollama instances deployed globally. This stage yielded 174,590 records, corresponding to 99,412 unique IP addresses post-deduplication.

  2. API Validation and Data Enrichment: Systematic probing of the ip:port/api/tags API endpoint for the initially identified IP addresses to validate Ollama service reachability and retrieve deployed AI model manifests. This stage confirmed 41,021 successfully responsive Ollama instances (originating from 24,038 unique IPs, with data persisted to the ollama table).

The final aggregated dataset is stored in the ollama relational table.

The analysis presented herein is primarily based on data from the ollama table, which contains records from successfully probed API endpoints, including IP address, port, geo-location, and the JSON payload (containing the model list), among other attributes.
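The validation stage described above can be sketched as follows. This is a simplified illustration, not the exact tooling behind the report: the probe function, timeout, and concurrency settings are assumptions.

```python
import concurrent.futures
import json
import urllib.request

def probe(endpoint: str, timeout: float = 5.0):
    """GET /api/tags from one candidate "ip:port"; return the list of
    model tags on success, or None on any network/parse failure."""
    try:
        with urllib.request.urlopen(f"http://{endpoint}/api/tags",
                                    timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except Exception:
        return None

def probe_all(endpoints):
    """Probe candidates concurrently; keep only reachable instances."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        results = list(zip(endpoints, pool.map(probe, endpoints)))
    return {ep: models for ep, models in results if models is not None}
```

In practice a scan like this also needs rate limiting and retry policies, which are omitted here for brevity.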

4. Overall Deployment Statistics

  • Number of initial records from Fofa scan: 174,590

  • Number of unique IPs from Fofa initial scan: 99,412

  • Number of Ollama instances successfully responding to /api/tags: 41,021 (derived from records where status = 'success' in the ollama table)

  • Number of corresponding unique IP addresses: 24,038 (derived from records where status = 'success' in the ollama table)

  • Ratio of accessible IPs to initially identified IPs: (24038 / 99412) * 100% ≈ 24.18%

This indicates that among all Ollama instances discovered via Fofa, approximately a quarter expose the /api/tags endpoint publicly, enabling retrieval of information regarding their deployed model configurations.

5. Geographical Distribution Analysis

5.1 Top 20 Deployment Countries/Regions

The following table presents the top 20 countries/regions ranked by the count of unique IP addresses hosting Ollama instances.

| Rank | Country/Region | Unique IP Count |
|------|----------------|-----------------|
| 1 | United States | 29195 |
| 2 | China | 16464 |
| 3 | Japan | 5849 |
| 4 | Germany | 5438 |
| 5 | United Kingdom | 4014 |
| 6 | India | 3939 |
| 7 | Singapore | 3914 |
| 8 | South Korea | 3773 |
| 9 | Ireland | 3636 |
| 10 | France | 3599 |
| 11 | Australia | 3558 |
| 12 | Brazil | 2909 |
| 13 | Canada | 2763 |
| 14 | South Africa | 2742 |
| 15 | Sweden | 2113 |
| 16 | Hong Kong SAR, China | 1277 |
| 17 | Israel | 675 |
| 18 | Taiwan, China | 513 |
| 19 | Russia | 475 |
| 20 | Finland | 308 |

Ollama Top 20 Deployment Countries/Regions

5.2 Top 20 Global City Deployments

The table below shows the top 20 cities worldwide, ranked by the number of unique IPs with Ollama instances.

| Rank | City | Country/Region | Unique IP Count |
|------|------|----------------|-----------------|
| 1 | Ashburn | United States | 5808 |
| 2 | Portland | United States | 5130 |
| 3 | Singapore | Singapore | 3914 |
| 4 | Frankfurt am Main | Germany | 3908 |
| 5 | Beijing | China | 3906 |
| 6 | London | United Kingdom | 3685 |
| 7 | Columbus | United States | 3672 |
| 8 | Mumbai | India | 3637 |
| 9 | Dublin | Ireland | 3631 |
| 10 | Tokyo | Japan | 3620 |
| 11 | Sydney | Australia | 3487 |
| 12 | Paris | France | 3175 |
| 13 | San Jose | United States | 2815 |
| 14 | Sao Paulo | Brazil | 2753 |
| 15 | Cape Town | South Africa | 2692 |
| 16 | Montreal | Canada | 2535 |
| 17 | Seattle | United States | 2534 |
| 18 | Hangzhou | China | 2447 |
| 19 | Seoul | South Korea | 2327 |
| 20 | Osaka | Japan | 2184 |

5.3 Top 10 US City Distribution

| Rank | City | Unique IP Count |
|------|------|-----------------|
| 1 | Ashburn | 5808 |
| 2 | Portland | 5130 |
| 3 | Columbus | 3672 |
| 4 | San Jose | 2815 |
| 5 | Seattle | 2534 |
| 6 | Westlake Village | 1714 |
| 7 | Boardman | 855 |
| 8 | Florence | 776 |
| 9 | San Francisco | 753 |
| 10 | Boulder | 642 |

Ollama Top 10 US City Distribution

5.4 Top 10 Mainland China City Distribution

Hong Kong SAR and Taiwan, China are excluded from this mainland Top 10 table; their deployments are counted separately in the country/region-level statistics.

| Rank | City | Country | Unique IP Count |
|------|------|---------|-----------------|
| 1 | Beijing | China | 3906 |
| 2 | Hangzhou | China | 2447 |
| 3 | Shanghai | China | 1335 |
| 4 | Guangzhou | China | 1296 |
| 5 | Shenzhen | China | 768 |
| 6 | Chengdu | China | 469 |
| 7 | Nanjing | China | 329 |
| 8 | Chongqing | China | 259 |
| 9 | Suzhou | China | 257 |
| 10 | Wuhan | China | 249 |

Ollama Top 10 Mainland China City Deployments

5.5 US-China Top 10 City Deployment Comparison

To facilitate a more direct comparison of Ollama deployment densities at the municipal level between the US and China, the table below juxtaposes the unique IP deployment counts for the top 10 cities in both nations:

| Rank | US City (Top 10) | US Unique IP Count | China City (Top 10) | China Unique IP Count |
|------|------------------|--------------------|---------------------|-----------------------|
| 1 | Ashburn | 5808 | Beijing | 3906 |
| 2 | Portland | 5130 | Hangzhou | 2447 |
| 3 | Columbus | 3672 | Shanghai | 1335 |
| 4 | San Jose | 2815 | Guangzhou | 1296 |
| 5 | Seattle | 2534 | Shenzhen | 768 |
| 6 | Westlake Village | 1714 | Chengdu | 469 |
| 7 | Boardman | 855 | Nanjing | 329 |
| 8 | Florence | 776 | Chongqing | 259 |
| 9 | San Francisco | 753 | Suzhou | 257 |
| 10 | Boulder | 642 | Wuhan | 249 |

Ollama US-China Top 10 City Deployment Comparison

Brief Analysis:

  • Leading City Deployment Volume: The top 3 US cities (Ashburn, Portland, Columbus) each exhibit over 3,000 unique IPs with Ollama deployments. China's foremost city (Beijing) exceeds 3,000 deployments, with its second city (Hangzhou) surpassing 2,000.

  • Technology and Economic Hubs: A significant number of the listed cities in both countries are recognized technology innovation centers or key economic regions.

  • Data Center Proximity: The inclusion of US cities such as Ashburn indicates a significant deployment footprint within cloud provider infrastructure and data centers.

  • Distribution Disparities: Cumulatively, the total IP count in the US Top 10 cities is substantially higher than in China's Top 10. However, both nations demonstrate a pattern where a few core urban centers account for the majority of Ollama deployments.

This city-level comparative analysis further indicates that the adoption of Ollama, as a developer-centric tool, correlates strongly with regional tech ecosystems and industry maturation.

6. Model Analysis

6.1 Overview of AI Models, Parameters, and Quantization

Ollama supports a diverse array of open-source Large Language Models. These models are typically differentiated by the following characteristics:

6.1.1 Common Model Families

The current open-source landscape features a proliferation of prominent LLM families, each with distinct attributes:

  • Llama Series (Meta AI): e.g., Llama 2, Llama 3, Code Llama. Renowned for robust general-purpose capabilities and extensive community support, fostering numerous fine-tuned derivatives. Models such as llama3.1, hermes3 identified in our dataset are often Llama-architecture based.

  • Mistral Series (Mistral AI): e.g., Mistral 7B, Mixtral 8x7B. Notable for efficiency and high performance benchmarks, particularly its Mixture of Experts (MoE) architectures.

  • Gemma Series (Google): e.g., Gemma 2B, Gemma 7B. Open-weight models released by Google, leveraging technology from their more powerful Gemini model series.

  • Phi Series (Microsoft): e.g., Phi-2, Phi-3. Focuses on compact yet performant models, emphasizing "Small Language Models (SLMs)".

  • DeepSeek Series (DeepSeek AI): e.g., DeepSeek Coder, DeepSeek LLM. PRC-developed AI models demonstrating strong capabilities in code generation and general-purpose tasks.

  • Qwen Series (Alibaba Tongyi Qianwen): e.g., Qwen1.5. A model series from Alibaba DAMO Academy, supporting multiple languages and tasks.

  • Numerous other notable models exist, including Yi (01.AI), Command R (Cohere), etc.

Ollama, via its Modelfile abstraction, enables users to readily utilize these base models or their fine-tuned variants. Model identifiers often adhere to the family:size-variant-quantization convention, e.g., llama3:8b-instruct-q4_K_M.
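A best-effort split of such identifiers can be written as follows. Note that the convention is informal (tags like llama3:latest carry a version label rather than a size), so this is a heuristic sketch, not an official parser:

```python
def parse_tag(tag: str) -> dict:
    """Heuristically split 'family:size-variant-quantization' tags.
    Any field after the family may be absent; unrecognized segments
    (e.g. 'latest') fall through to the variant slot."""
    family, _, rest = tag.partition(":")
    info = {"family": family, "size": None, "variant": None, "quant": None}
    for part in (rest.split("-") if rest else []):
        low = part.lower()
        if low[:1].isdigit() and low[-1] in "bm":           # 8b, 1.5b, 135m
            info["size"] = part
        elif low.startswith("q") or low in ("f16", "f32"):  # q4_K_M, f16
            info["quant"] = part
        else:                                               # instruct, text, latest
            info["variant"] = part
    return info

print(parse_tag("llama3:8b-instruct-q4_K_M"))
# → {'family': 'llama3', 'size': '8b', 'variant': 'instruct', 'quant': 'q4_K_M'}
```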

6.1.2 Model Parameters (Parameter Size)

The number of model parameters (typically denoted in Billions 'B' or Millions 'M') is a key indicator of a model's scale and potential capability. Common parameter sizes include:

  • Small Models: < 7B (e.g., 1.5B, 2B, 3B). Characterized by fast inference and low resource demands, suitable for specialized tasks or resource-constrained environments.

  • Medium Models: 7B, 8B, 13B. Offer a compelling balance between computational capability and resource demands, representing a highly popular segment within the community.

  • Large Models: 30B, 33B, 40B, 70B+. Generally exhibit superior capabilities but necessitate greater computational resources (RAM, VRAM) and entail longer inference latencies.

The parameter_size field in our dataset (e.g., "8.0B", "7B", "134.52M") quantifies this attribute.
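For aggregation, these strings must be normalized to a numeric count. A small helper suffices, assuming only the "B" and "M" suffixes observed in the dataset:

```python
def param_count(size_str: str) -> float:
    """Convert a parameter_size string ('8.0B', '7B', '134.52M')
    into an absolute parameter count."""
    scale = {"B": 1e9, "M": 1e6}[size_str[-1].upper()]
    return float(size_str[:-1]) * scale

print(param_count("8.0B"))                         # → 8000000000.0
print(param_count("7B") > param_count("134.52M"))  # → True
```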

6.1.3 Quantization Versions (Quantization Level)

Quantization is a technique employed to reduce model footprint and accelerate inference by lowering the numerical precision of model weights (e.g., from 16-bit floating-point FP16 to 4-bit integer INT4).

  • Prevalent Quantization Schemes: Ollama and the GGUF format (leveraged by Llama.cpp) support various quantization strategies, such as Q2_K, Q3_K_S, Q3_K_M, Q3_K_L, Q4_0, Q4_K_M, Q5_K_M, Q6_K, Q8_0.

    • The numerical prefix (e.g., 2, 3, 4, 5, 6, 8) generally indicates the bit precision.

    • K-series quantization (e.g., Q4_K_M) represents enhanced quantization methods introduced in llama.cpp, typically achieving superior performance at equivalent bit depths.

    • _S, _M, _L suffixes usually denote K-quant variants affecting different model components.

    • F16 (FP16) signifies 16-bit floating-point, often considered unquantized or a baseline quantization. F32 (FP32) denotes full precision.

  • Trade-off: Aggressive quantization (lower bit precision) yields reduced model footprint and inference latency, typically at the cost of some fidelity (model performance degradation). Users must select based on hardware constraints and model quality requisites.

The quantization_level field in our dataset (e.g., "Q4_K_M", "F16") specifies this.
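The footprint side of this trade-off can be approximated as parameters × bits-per-weight / 8. The effective bits-per-weight figures below are rough assumed averages (K-quants mix precisions across tensors, and runtime overhead such as the KV cache is ignored), so treat the results as ballpark estimates only:

```python
# Assumed effective bits per weight; real values vary by scheme version.
BITS_PER_WEIGHT = {"Q4_0": 4.5, "Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}

def approx_weight_gib(params: float, quant: str) -> float:
    """Rough size of the model weights alone, in GiB."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 2**30

print(round(approx_weight_gib(8e9, "Q4_K_M"), 1))  # → 4.5 (an 8B model, 4-bit)
print(round(approx_weight_gib(8e9, "F16"), 1))     # → 14.9 (same model, FP16)
```

The roughly 3x gap between the two estimates illustrates why 4-bit quantization dominates on consumer GPUs.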

6.2 Top Popular Model Names

The table below lists the Top 10 model tags ranked by unique IP deployments, including their associated family, parameter size, and quantization level metadata.

| Rank | Model Name (model_name) | Unique IP Deployments | Total Deployment Instances |
|------|-------------------------|-----------------------|----------------------------|
| 1 | llama3:latest | 12659 | 24628 |
| 2 | deepseek-r1:latest | 12572 | 24578 |
| 3 | mistral:latest | 11163 | 22638 |
| 4 | qwen:latest | 9868 | 21007 |
| 5 | llama3:8b-text-q4_K_S | 9845 | 20980 |
| 6 | smollm2:135m | 4058 | 5016 |
| 7 | llama2:latest | 3124 | 3928 |
| 8 | hermes3:8b | 2856 | 3372 |
| 9 | llama3.1:8b | 2714 | 3321 |
| 10 | qwen2.5:1.5b | 2668 | 3391 |

Ollama Top Popular Model Names

(Note: Unique IP Deployments counts the unique IP addresses deploying at least one instance of a given model tag; Total Deployment Instances counts every occurrence of that tag across all reported model lists. A single IP may therefore contribute multiple instances, via repeated records or multiple tags of the same base model.)

Initial Observations (Popular Model Names):

  • Models tagged with :latest are highly prevalent (e.g., llama3:latest, deepseek-r1:latest, mistral:latest, qwen:latest), suggesting a user preference for fetching the most recent model versions.

  • Llama series models (e.g., llama3:latest, llama3:8b-text-q4_K_S, llama2:latest, llama3.1:8b) occupy multiple top positions, underscoring their significant adoption.

  • PRC-developed AI models like deepseek-r1:latest (DeepSeek series) and qwen:latest (Tongyi Qianwen series) also demonstrate substantial adoption, securing high rankings.

  • Specific quantized versions, such as llama3:8b-text-q4_K_S, feature in the top ten, indicating user selection for specific performance/resource utilization profiles.

  • Compact models like smollm2:135m and qwen2.5:1.5b show considerable deployment volume, addressing requirements for resource-constrained or low-latency deployments.

6.3 Top Model Families

Model family (represented by the details.family field) denotes the base architecture or primary technological lineage. The following model families exhibit higher deployment counts based on our data analysis:

| Rank | Model Family (family) | Unique IP Deployments (Estimated) | Total Deployment Instances (Estimated) |
|------|-----------------------|-----------------------------------|----------------------------------------|
| 1 | llama | ~20250 | ~103480 |
| 2 | qwen2 | ~17881 | ~61452 |
| 3 | nomic-bert | ~1479 | ~1714 |
| 4 | gemma3 | ~1363 | ~2493 |
| 5 | bert | ~1228 | ~2217 |
| 6 | mllama | ~943 | ~1455 |
| 7 | gemma | ~596 | ~750 |
| 8 | deepseek2 | ~484 | ~761 |
| 9 | phi3 | ~368 | ~732 |
| 10 | gemma2 | ~244 | ~680 |

Ollama Top Model Families

(Note: Figures are estimates summarized from a query of the Top 50 model details and may exhibit slight deviations from precise global aggregates, but trends are representative.)

Initial Observations (Popular Model Families):

  • The llama family holds a dominant position, consistent with Llama series models serving as foundational architectures for many open-source LLMs and their direct widespread application. Its extensive ecosystem and numerous fine-tuned derivatives solidify its status as the most popular choice.

  • qwen2 (Tongyi Qianwen Qwen2 series), as the second largest family, demonstrates strong market competitiveness within China and globally.

  • The presence of nomic-bert and bert is notable. These are not conversational LLMs but text embedding and other foundational NLP models; their high deployment volume shows that Ollama's utility extends to such tasks. Ollama's automatic download of a default embedding model (e.g., nomic-embed-text) for certain operations, such as embedding vector generation, is likely a primary contributor to their high ranking.

  • Google's gemma series (encompassing gemma3, gemma, gemma2) also shows significant adoption rates.

  • Other prominent model families such as deepseek2 and phi3 are present in the top ten.

  • mllama denotes the multimodal Llama architecture, used by vision-capable Llama models such as the llama3.2-vision series.

6.4 Top Original Parameter Size Statistics

Model parameter size (details.parameter_size field) is a key metric of model scale. Due to varied string representations of parameter sizes in the raw data (e.g., "8.0B", "7B", "134.52M"), we perform a direct count of these original strings. The following parameter size representations show higher deployment numbers:

| Rank | Parameter Size (Original String) | Unique IP Deployments (Estimated) | Total Deployment Instances (Estimated) |
|------|----------------------------------|-----------------------------------|----------------------------------------|
| 1 | 8.0B | ~14480 | ~52577 |
| 2 | 7.6B | ~14358 | ~28105 |
| 3 | 7.2B | ~11233 | ~22907 |
| 4 | 4B | ~9895 | ~21058 |
| 5 | 7B | ~4943 | ~11738 |
| 6 | 134.52M | ~4062 | ~5266 |
| 7 | 1.5B | ~2759 | ~3596 |
| 8 | 13B | ~2477 | ~3311 |
| 9 | 1.8B | ~2034 | ~2476 |
| 10 | 3.2B | ~1553 | ~2244 |
| 11 | 137M | ~1477 | ~1708 |
| 12 | 12.2B | ~1421 | ~2000 |
| 13 | 32.8B | ~1254 | ~2840 |
| 14 | 14.8B | ~1123 | ~2091 |
| 15 | 4.3B | ~943 | ~1194 |

Ollama Top Original Parameter Size Statistics

(Note: Values are estimated based on a summary of parameter information from the previously queried Top 50 model details list.)

Initial Observations (Popular Parameter Sizes):

  • Models in the 7B to 8B parameter range are predominant: "8.0B", "7.6B", "7.2B", "7B" account for the majority of deployments. This typically corresponds to highly popular community models like Llama 2/3 7B/8B series, Mistral 7B, and their various fine-tuned derivatives, which offer a compelling balance between performance and resource demands.

  • 4B scale models also hold a significant share: The high deployment volume of "4B" models is noteworthy.

  • Million-parameter (M-scale) lightweight models are widely adopted: The high ranking of "134.52M" and "137M" is likely attributable to the popularity of embedding models (e.g., nomic-embed-text) or very small specialized models (e.g., smollm series). These models are compact, fast, and suitable for resource-constrained or latency-sensitive applications.

  • Consistent demand for small models in the 1B-4B range: Models with parameter sizes such as "1.5B", "1.8B", "3.2B", "4.3B" are adopted by a specific user segment.

  • Large models exceeding 10B parameters: Models such as "13B", "12.2B", "32.8B", "14.8B", while having fewer unique IP deployments compared to the 7-8B tier, still exhibit considerable deployment volume, indicating community demand for more capable models, despite their increased hardware prerequisites.

6.5 Top Quantization Level Statistics

Model quantization level (details.quantization_level field) reflects the weight precision adopted to reduce model size and accelerate inference. Below are the quantization levels with higher deployment counts:

| Rank | Quantization Level (Original String) | Unique IP Deployments (Estimated) | Total Deployment Instances (Estimated) |
|------|--------------------------------------|-----------------------------------|----------------------------------------|
| 1 | Q4_K_M | ~20966 | ~53688 |
| 2 | Q4_0 | ~18385 | ~88653 |
| 3 | Q4_K_S | ~9860 | ~21028 |
| 4 | F16 | ~5793 | ~9837 |
| 5 | Q8_0 | ~596 | ~1574 |
| 6 | unknown | ~266 | ~1318 |
| 7 | Q5_K_M | ~97 | ~283 |
| 8 | F32 | ~85 | ~100 |
| 9 | Q6_K | ~60 | ~178 |
| 10 | Q2_K | ~54 | ~140 |

Ollama Top Quantization Level Statistics

(Note: Values are estimated based on a summary of quantization information from the previously queried Top 50 model details list.)

Initial Observations (Popular Quantization Levels):

  • 4-bit quantization is the dominant strategy: The three 4-bit schemes Q4_K_M, Q4_0, and Q4_K_S dominate the deployment statistics. This clearly indicates widespread community adoption of 4-bit quantization as the preferred balance between model performance, inference speed, and resource footprint (particularly VRAM).

  • F16 (16-bit floating-point) maintains a significant presence: As an unquantized (or minimally quantized) version, the high deployment of F16 suggests a considerable user base prioritizing maximum model fidelity or possessing adequate hardware resources.

  • Q8_0 (8-bit quantization) serves as a supplementary option: It offers an intermediate option between 4-bit and FP16 precision.

  • Presence of unknown values: Indicates missing or non-standardized entries in model metadata for quantization level.

6.6 Distribution of AI Compute Capacity (by Model Parameter Size): China vs. USA

To provide a granular analysis of how models of varying scales are deployed in major countries, we categorized and aggregated the parameter sizes of models deployed on Ollama instances in the United States and China. Parameter size is often used as a key proxy for model complexity and requisite AI compute capacity.

Parameter Scale Classification Schema:

  • Small: < 1 Billion parameters (< 1B)

  • Medium: 1 Billion to < 10 Billion parameters (1B to < 10B)

  • Large: 10 Billion to < 50 Billion parameters (10B to < 50B)

  • Extra Large: >= 50 Billion parameters (>= 50B)
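This bucketing can be reproduced directly from the parameter_size strings. A minimal sketch using the schema's thresholds (the "B"/"M" suffix handling is an assumption about the dataset's string format):

```python
def scale_category(size_str: str) -> str:
    """Map a parameter_size string onto the report's four buckets."""
    scale = {"B": 1e9, "M": 1e6}[size_str[-1].upper()]
    n = float(size_str[:-1]) * scale  # absolute parameter count
    if n < 1e9:
        return "Small (<1B)"
    if n < 10e9:
        return "Medium (1B to <10B)"
    if n < 50e9:
        return "Large (10B to <50B)"
    return "Extra Large (>=50B)"

print(scale_category("137M"))   # → Small (<1B)
print(scale_category("8.0B"))   # → Medium (1B to <10B)
print(scale_category("32.8B"))  # → Large (10B to <50B)
```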

The table below details the number of unique IPs deploying models of different parameter scales in the US and China:

| Country | Parameter Scale Category | Unique IP Count |
|---------|--------------------------|-----------------|
| China | Small (<1B) | 3313 |
| China | Medium (1B to <10B) | 4481 |
| China | Large (10B to <50B) | 1548 |
| China | Extra Large (>=50B) | 280 |
| United States | Small (<1B) | 1368 |
| United States | Medium (1B to <10B) | 6495 |
| United States | Large (10B to <50B) | 1301 |
| United States | Extra Large (>=50B) | 58 |

Unique IPs deploying models of each parameter scale in the US and China

Data Insights and Analysis:

  1. Medium-sized models are mainstream, with differing strategic emphases:

    • United States: Deployments of medium-sized models (1B-10B) are absolutely dominant in the US (6495 unique IPs).

    • China: Medium-sized models (4481 unique IPs) are also the most deployed category in China; however, the deployment of small models (<1B) in China (3313 unique IPs) is very substantial.

  2. Notable divergence in small model deployments: China's extensive deployment of small models may reflect a strategic focus on edge AI, mobile-first AI applications, and analogous use-cases.

  3. Deployment of large and extra-large models: China exhibits greater exploratory activity with large and extra-large models, albeit from a smaller deployment baseline relative to medium models.

  4. Implications for overall AI compute investment: The US concentration in medium-sized models indicates a mature adoption phase focused on practical AI applications. China demonstrates a strong position in small model deployments and active exploration of larger model architectures.

  5. Implications for global trends: Medium-sized models are likely popular globally. Regional model adoption strategies likely diverge based on local ecosystem maturity and resource availability.

This segmented analysis of model parameter scales in China and the US reveals distinct strategic focuses and development trajectories for Ollama applications in these two key regions.

7. Network Insights

7.1 Port Usage

  • 11434 (default port): The predominant deployment configuration (30,722 unique IPs) utilizes the default port 11434 for Ollama instances.

  • Other common ports: Ports such as 80 (1,619 unique IPs), 8080 (1,571 unique IPs), 443 (1,339 unique IPs), etc., are also utilized, which may suggest deployment behind reverse proxies or user-configured port assignments.

7.2 Protocol Usage

  • HTTP: Approximately 65,506 unique IPs host instances serving via the HTTP protocol.

  • HTTPS: Approximately 43,765 unique IPs host instances serving via the HTTPS protocol.

A majority of instances remain exposed over unencrypted HTTP, presenting potential security vulnerabilities. (Note: A single IP may support both HTTP and HTTPS, thus the sum of IP counts may exceed the total unique IP count.)

7.3 Main Hosting Providers (AS Organization)

Ollama instance hosting exhibits a high concentration within major Cloud Service Provider (CSP) networks and telecommunications infrastructure.

| Rank | AS Organization | Unique IP Count | Main Associated Provider |
|------|-----------------|-----------------|--------------------------|
| 1 | AMAZON-02 | 53658 | AWS |
| 2 | AMAZON-AES | 5539 | AWS |
| 3 | Chinanet | 4964 | China Telecom |
| 4 | Hangzhou Alibaba Advertising Co.,Ltd. | 2647 | Alibaba Cloud |
| 5 | HENGTONG-IDC-LLC | 2391 | Hosting Provider |
| 6 | Shenzhen Tencent Computer Systems Company Limited | 1682 | Tencent Cloud |
| 7 | CHINA UNICOM China169 Backbone | 1606 | China Unicom |
| 8 | Hetzner Online GmbH | 972 | Hetzner |
| 9 | China Unicom Beijing Province Network | 746 | China Unicom (Beijing) |
| 10 | LEASEWEB-USA-LAX | 735 | Leaseweb |

Ollama instance hosting is highly concentrated among cloud service providers

AWS (AMAZON-02, AMAZON-AES) commands the largest market share, followed by major Chinese telecommunications operators and CSPs (e.g., Alibaba Cloud, Tencent Cloud). Other hosting providers like Hetzner and Leaseweb also maintain significant shares.

8. Security and Other Observations

  • Version Information: Ollama version statistics are omitted from this report due to security considerations.

  • HTTP Exposure Risk: As previously noted, a large number of Ollama instances are exposed via HTTP without TLS encryption. This renders communication payloads (e.g., model interactions) susceptible to interception or modification. Implementation of a reverse proxy with HTTPS/TLS termination is strongly recommended.

  • API Accessibility: The data in this report is predicated on Ollama instances with publicly accessible /api/tags endpoints. The true deployment count is potentially higher, as some instances may reside within private networks or have ingress access restricted by firewall policies.

9. Conclusion and Brief Review

This report, by analyzing 174,590 scan records (99,412 unique IPs) and probing their /api/tags interfaces, yields the following key conclusions and observations:

1. Global Deployment Overview and Geographical Distribution:

  • Ollama, as a streamlined utility for local LLM execution, has achieved widespread global deployment. This analysis identified 99,412 unique public IPs, of which 24,038 responded to API probes.

  • Significant Geo-Concentration: The United States and China are the two countries/regions with the most concentrated Ollama deployments, collectively representing a substantial share of total accessible instances (US 29,195, China 16,464). Nations like Japan, Germany, the UK, India, and Singapore also exhibit notable deployment figures.

  • Urban Hotspots: In the US, cities like Ashburn, Portland, and Columbus lead in deployment density. In China, technologically-hub cities such as Beijing, Hangzhou, Shanghai, and Guangzhou are primary deployment loci. This often correlates with concentrations of technology firms, data center infrastructure, and developer ecosystems.

2. AI Model Deployment Trends:

  • Popular Model Tags: Generic latest tags such as llama3:latest, deepseek-r1:latest, mistral:latest, qwen:latest exhibit highest popularity. Specifically optimized versions like llama3:8b-text-q4_K_S are also favored for their optimized balance of performance and resource usage.

  • Dominant Model Families: The llama family demonstrates clear market leadership, followed by qwen2. The high ranking of embedding model families like nomic-bert and bert is noteworthy, potentially attributable to Ollama's default behavior or bootstrapping processes for embedding generation.

  • Parameter Size Preferences: Models with 7B-8B parameters represent the current mainstream adoption. Lightweight models at the million-parameter scale and large models exceeding 10B parameters cater to distinct market segments. A US-China comparison indicates US deployments favor medium-sized models, whereas PRC deployments show greater activity in small and extra-large model categories.

  • Quantization Level Choices: 4-bit quantization (particularly Q4_K_M and Q4_0) is the predominant choice. F16, as a higher-fidelity alternative, also maintains an important position.

  • Metadata Complexity: Analysis of model metadata (e.g., interpretation of the details.family field) occasionally reveals discrepancies or ambiguities relative to model names or common understanding, underscoring the heterogeneity in metadata management within the open-source ecosystem.

3. Technical Infrastructure:

  • Hosting Environments: A significant volume of Ollama instances are hosted within major CSPs like AWS, Alibaba Cloud, Tencent Cloud, and within the networks of major national telecommunications operators.

  • Service Ports: Ollama's default port 11434 is the predominantly utilized configuration, though a considerable number of instances are also exposed via standard web ports (80, 443, 8080).

4. Objective Assessment:

  • Popularity of Ollama: The data clearly demonstrates Ollama's significant adoption within developer and AI practitioner communities worldwide.

  • Vitality of the Open-Source Ecosystem: The diversity of popular models and the widespread use of various parameter and quantization versions reflect the rapid evolution of the open-source AI model ecosystem.

  • User Preference for Balanced Solutions: In model selection, users tend to prioritize a balance between model capability, operational efficiency, and TCO/hardware expenditure.

  • Security Considerations and Openness: A large number of instances permit public access to their model lists which, while beneficial for community accessibility, also introduces potential security exposures.

5. Future Outlook:

  • The proliferation of more efficient, compact models and advancements in quantization techniques are anticipated to further lower the barrier to Ollama deployment.

  • Standardization of model metadata and community-driven model sharing initiatives are pivotal for enhancing ecosystem transparency and usability.

In summary, Ollama is emerging as a key enabler, bridging advanced Large Language Models with a diverse user base of developers, researchers, and end-users. This data analysis provides valuable telemetry for assessing its current global deployment landscape and user adoption patterns.
