Automating Sales Forecasting for New Products
Introducing new items into a retail assortment is a challenge familiar to both manufacturers and retailers. Retailers hold large volumes of sales data and can, in principle, estimate sales of a new product from its characteristics. In practice, the process is slow, labor-intensive, and only partially automated: analysts still gather statistics on similar products or categories by hand. This article outlines a method for automating sales estimates for products with no sales history.
Cold‑Start Forecasting Using Traditional Methods
When a product has no sales history, forecasting its demand is known as the cold-start problem. With no past sales to learn from, analysts rely on general product attributes such as category, weight, and price.

When introducing a new product to the market, retailers typically perform three key tasks:
  • Identifying similar products — usually done manually
  • Collecting sales statistics for these analogs
  • Estimating sales of the new product based on statistics and expert judgment
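As a reference point, the three manual tasks above can be sketched in a few lines. This is a minimal sketch, assuming a hypothetical in-memory catalog in which each product carries a category, a price, and first-month unit sales. The field names, the 20% price tolerance, and the use of the median in place of expert judgment are illustrative assumptions, not a description of any particular retailer's process.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Product:
    name: str
    category: str
    price: float
    first_month_units: int  # hypothetical field: units sold in the first month


def find_analogs(new_category, new_price, catalog, price_tolerance=0.2):
    """Task 1: same category and a price within +/- price_tolerance of the new item."""
    low = new_price * (1 - price_tolerance)
    high = new_price * (1 + price_tolerance)
    return [p for p in catalog if p.category == new_category and low <= p.price <= high]


def baseline_estimate(analogs):
    """Tasks 2-3: summarize analog sales; the median stands in for expert judgment."""
    if not analogs:
        return None  # no analogs found: an analyst has to decide manually
    return median(p.first_month_units for p in analogs)


# Made-up catalog, for illustration only.
catalog = [
    Product("Oat milk 1L", "plant milk", 2.49, 120),
    Product("Almond milk 1L", "plant milk", 2.79, 90),
    Product("Soy milk 1L", "plant milk", 1.59, 200),
    Product("Orange juice 1L", "juice", 2.59, 300),
]
analogs = find_analogs("plant milk", 2.59, catalog)
estimate = baseline_estimate(analogs)
```

Even in this toy form, the weak points of the manual approach are visible: the analog rule is hand-written, and the estimate ignores everything in the product description beyond category and price.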
Below, we explore how this workflow can be automated using Natural Language Processing (NLP).
Machine Learning and Large Language Models as the Foundation
Machine learning (ML) and large language models (LLMs) can significantly streamline the process by automatically extracting product features from names and descriptions.
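The feature-extraction idea can be illustrated with a short sketch. It assumes the LLM is prompted to reply with JSON and that the business has fixed a small attribute schema. The `call_llm` argument is a hypothetical placeholder for whatever LLM client is in use, and the attribute names, price tiers, and prompt wording are assumptions made for this example. A fake model reply is used so the code runs without any external service.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

PRICE_TIERS = {"economy", "mid", "premium"}  # assumed tiers; defined by the business

PROMPT_TEMPLATE = (
    "Extract attributes from the product text and answer with JSON only, "
    'using the keys "product_group", "price_tier", "weight_g".\n'
    "Product text: {text}"
)


@dataclass
class ProductFeatures:
    product_group: str
    price_tier: str
    weight_g: Optional[float]


def parse_features(raw: str) -> ProductFeatures:
    """Validate the model's JSON reply instead of trusting it blindly."""
    data = json.loads(raw)
    tier = str(data.get("price_tier", "")).lower()
    if tier not in PRICE_TIERS:
        raise ValueError(f"unexpected price tier: {tier!r}")
    weight = data.get("weight_g")
    return ProductFeatures(
        product_group=str(data["product_group"]).strip().lower(),
        price_tier=tier,
        weight_g=float(weight) if weight is not None else None,
    )


def extract_features(text: str, call_llm: Callable[[str], str]) -> ProductFeatures:
    """call_llm is a placeholder: it takes a prompt and returns the model's text reply."""
    return parse_features(call_llm(PROMPT_TEMPLATE.format(text=text)))


def fake_llm(prompt: str) -> str:
    # Stand-in reply so the sketch runs without a real model.
    return '{"product_group": "Plant Milk", "price_tier": "Mid", "weight_g": 1000}'


features = extract_features("Oat milk, barista edition, 1 L carton", fake_llm)
```

Validating the reply against a fixed schema is a deliberate choice here: free-form model output that does not match the expected attributes fails early instead of silently reaching the forecasting step.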

Step‑by‑Step Concept
  1. Feature generation: An LLM extracts structured attributes from product descriptions. These attributes can be predefined by the business (e.g., product group, price tier, weight).
  2. Automated analog search: Based on extracted attributes, the system retrieves sales data for similar products. This step can be fully automated by defining a data‑extraction pipeline.
  3. Sales statistics construction: The system builds aggregated statistics for analog products, including monthly demand and seasonality.
  4. Feature set formation: The LLM-extracted features are combined with traditional aggregated metrics for similar products and categories.
  5. Forecast model creation: A machine learning model trained on this feature set predicts demand for the new product.
  6. Business validation: Forecasts are adjusted using business rules such as minimum stock levels, logistics constraints, shelf‑space limits, and contractual obligations.
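Steps 2, 3, and 6 above can be sketched with plain Python. This is a simplified illustration under stated assumptions: products are described by small attribute dictionaries, similarity is the share of matching attributes (one of many possible measures), sales history is a list of 12 monthly unit counts per product, and the business rules are reduced to a minimum stock level and a shelf limit. All names and numbers are hypothetical. The ML model of step 5 is not shown; the monthly profile and seasonality index computed here are the kind of aggregated metrics that would join the feature set in step 4.

```python
from statistics import mean


def similarity(a: dict, b: dict) -> float:
    """Share of shared attributes on which two products agree (simple matching)."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def top_analogs(new_attrs: dict, catalog: dict, k: int = 2) -> list:
    """Step 2: rank existing products by attribute overlap and keep the best k."""
    ranked = sorted(catalog, key=lambda pid: similarity(new_attrs, catalog[pid]), reverse=True)
    return ranked[:k]


def monthly_profile(analog_ids: list, sales: dict) -> list:
    """Step 3: average units per calendar month across the analogs."""
    return [mean(sales[pid][m] for pid in analog_ids) for m in range(12)]


def seasonality_index(profile: list) -> list:
    """Each month's demand relative to the average month (1.0 = average)."""
    avg = mean(profile)
    return [units / avg for units in profile]


def apply_business_rules(forecast: list, min_stock: float, shelf_capacity: float) -> list:
    """Step 6: clamp each month between a minimum stock level and a shelf limit."""
    return [min(max(units, min_stock), shelf_capacity) for units in forecast]


# Made-up data, for illustration only.
catalog = {
    "A": {"product_group": "plant milk", "price_tier": "mid", "pack": "1l"},
    "B": {"product_group": "plant milk", "price_tier": "premium", "pack": "1l"},
    "C": {"product_group": "juice", "price_tier": "economy", "pack": "2l"},
}
sales = {
    "A": [100] * 6 + [140] * 6,
    "B": [60] * 6 + [100] * 6,
    "C": [300] * 12,
}
new_attrs = {"product_group": "plant milk", "price_tier": "mid", "pack": "1l"}

analogs = top_analogs(new_attrs, catalog)
profile = monthly_profile(analogs, sales)
index = seasonality_index(profile)
adjusted = apply_business_rules(profile, min_stock=90, shelf_capacity=110)
```

In a production setting the analog profile would be an input to the forecast model rather than the forecast itself; it is clamped directly here only to keep the sketch short.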
Solution Architecture
The full solution can be broken down into functional modules:
  • Data sources: product cards, sales history, promo history, planned promotions
  • LLM module: feature extraction
  • Analog search module
  • Forecasting module: ML model plus business rules
  • Interface: API or dashboard
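One possible way to express this modular split in code is to treat each module as a pluggable function and wire them together in a single service object. The sketch below is an assumption about how such wiring might look, not a prescribed design: `ForecastService` and its field names are hypothetical, and the lambdas are stubs standing in for the real LLM, analog search, model, and rule modules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ForecastService:
    """Wires the modules together; each field is a pluggable callable."""
    extract_features: Callable[[str], Dict[str, str]]       # LLM module
    find_analogs: Callable[[Dict[str, str]], List[str]]     # analog search module
    predict: Callable[[Dict[str, str], List[str]], float]   # ML model
    apply_rules: Callable[[float], float]                   # business rules

    def forecast(self, product_card: str) -> float:
        features = self.extract_features(product_card)
        analogs = self.find_analogs(features)
        raw = self.predict(features, analogs)
        return self.apply_rules(raw)


# Stub modules with made-up behavior, so the wiring can be run end to end.
service = ForecastService(
    extract_features=lambda card: {"product_group": "plant milk"},
    find_analogs=lambda features: ["A", "B"],
    predict=lambda features, analogs: 35.0 * len(analogs),
    apply_rules=lambda units: min(units, 60.0),
)
result = service.forecast("Oat milk, 1 L carton")
```

An API endpoint or dashboard would then only need to call `service.forecast`, and any single module could be replaced or tested in isolation.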
Conclusion
Automating demand estimation for new products is a major step toward more flexible and accurate assortment management. As assortments grow and turnover accelerates, manual estimation becomes a bottleneck that limits scalability. Machine learning removes much of that manual work, helping retailers react faster to market changes and reduce operational costs.

A fully automated system also unlocks additional benefits:
  • Scalability: the ability to process thousands of new products without expanding the analytics team
  • Process integration: automatic forecast delivery to ERP/CRM systems, procurement planning, and logistics
  • Adaptability: rapid re‑estimation of demand when descriptions, prices, or supply conditions change
  • Feedback loops: continuous improvement of model accuracy through accumulated forecast errors
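The feedback loop depends on measuring forecast errors once real sales arrive. The sketch below shows two common measures, weighted absolute percentage error (WAPE) and mean bias. The choice of these metrics is an assumption for illustration, since the approach described here does not prescribe a specific one, and the numbers are made up.

```python
def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("actual sales sum to zero; WAPE is undefined")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


def mean_bias(actual, forecast):
    """Average of (forecast - actual); positive means over-forecasting."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)


# Made-up monthly units for one product after launch.
actual = [80, 120, 100]
forecast = [100, 110, 120]

error = wape(actual, forecast)              # share of actual volume that was mis-forecast
direction = mean_bias(actual, forecast)     # sign shows over- or under-forecasting
```

Tracked per category or per launch cohort, such metrics indicate where the model should be retrained or where the analog search needs better attributes.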

In this approach, an LLM turns product descriptions into structured features, the system identifies relevant analogs automatically, and a forecasting model produces demand estimates that account for seasonality, pricing, and logistical constraints. This reduces reliance on expert judgment, lowers the risk of human error, and accelerates the launch of new products.