Automating Sales Forecasting for New Products

Introducing new items into a retail assortment is a challenge familiar to both manufacturers and retailers. Retailers hold large volumes of sales data and can estimate sales of a new product from its characteristics. In practice, though, the process is slow, labor-intensive, and only partially automated: analysts still gather statistics on similar products or categories by hand. This article outlines a method for automating the estimation of future sales for products with no historical data.

Cold‑Start Forecasting Using Traditional Methods

When a product has no sales history, forecasting its demand is known as the cold-start problem. In such cases, analysts rely on general product attributes such as category, weight, and price.

When introducing a new product to the market, retailers typically perform three key tasks:

Identifying similar products, usually done manually
Collecting sales statistics for these analogs
Estimating sales of the new product based on statistics and expert judgment

Below, we explore how this workflow can be automated using natural language processing (NLP), in particular large language models.

Machine Learning and Large Language Models as the Foundation

Machine learning (ML) and large language models (LLMs) can streamline this workflow by automatically extracting product features from names and descriptions, so that analysts no longer need to assemble them by hand.
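As a minimal sketch of what this extraction step might look like, assume the LLM is prompted to return a JSON object with a fixed set of attributes. The schema, the field names, and the `parse_llm_response` helper are hypothetical and not taken from any specific system. The LLM call itself is replaced by a hard-coded response string, so the example only shows how the returned text could be validated and typed.

```python
import json
from dataclasses import dataclass

# Hypothetical attribute schema; a real business would define its own fields.
ALLOWED_PRICE_TIERS = {"economy", "mid", "premium"}


@dataclass(frozen=True)
class ProductFeatures:
    product_group: str
    price_tier: str
    weight_g: float


def parse_llm_response(raw: str) -> ProductFeatures:
    """Validate the JSON text returned by an LLM and convert it to typed features."""
    data = json.loads(raw)
    price_tier = str(data["price_tier"]).strip().lower()
    if price_tier not in ALLOWED_PRICE_TIERS:
        raise ValueError(f"unexpected price tier: {price_tier!r}")
    weight_g = float(data["weight_g"])
    if weight_g <= 0:
        raise ValueError("weight must be positive")
    return ProductFeatures(
        product_group=str(data["product_group"]).strip().lower(),
        price_tier=price_tier,
        weight_g=weight_g,
    )


# Stand-in for a model response to a description such as
# "Dark chocolate bar, 72% cocoa, 100 g, premium line".
example_response = '{"product_group": "Chocolate", "price_tier": "Premium", "weight_g": 100}'
features = parse_llm_response(example_response)
print(features)
```

Restricting the output to a closed set of values (here, the price tiers) is one way to keep free-form model output from leaking into downstream statistics; responses that fail validation can be retried or routed to an analyst.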

Step‑by‑Step Concept

Feature generation: An LLM extracts structured attributes from product descriptions. These attributes can be predefined by the business (e.g., product group, price tier, weight).
Automated analog search: Based on extracted attributes, the system retrieves sales data for similar products. This step can be fully automated by defining a data‑extraction pipeline.
Sales statistics construction: The system builds aggregated statistics for analog products, including monthly demand and seasonality.
Feature set formation: The LLM-extracted features are combined with traditional aggregated metrics from similar products and categories.
Forecast model creation: A machine learning model trained on this feature set predicts demand for the new product.
Business validation: Forecasts are adjusted using business rules such as minimum stock levels, logistics constraints, shelf‑space limits, and contractual obligations.
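The analog search and statistics steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed method: the catalog, the sales figures, the attribute names, and the equal-weight `similarity` score are all invented for the example, and a production system might use embeddings or a learned distance instead.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical catalog: attributes of existing products (e.g. from the LLM step).
catalog = {
    "sku_a": {"product_group": "chocolate", "price_tier": "premium", "weight_g": 100},
    "sku_b": {"product_group": "chocolate", "price_tier": "premium", "weight_g": 90},
    "sku_c": {"product_group": "chocolate", "price_tier": "economy", "weight_g": 200},
    "sku_d": {"product_group": "tea", "price_tier": "premium", "weight_g": 100},
}

# Hypothetical monthly unit sales: (sku, month number) -> units sold.
sales = {
    ("sku_a", 11): 120, ("sku_a", 12): 200,
    ("sku_b", 11): 80, ("sku_b", 12): 160,
    ("sku_c", 11): 300, ("sku_c", 12): 310,
    ("sku_d", 11): 50, ("sku_d", 12): 55,
}


def similarity(a: dict, b: dict) -> float:
    """Score in [0, 1]: matching categorical attributes plus weight closeness."""
    group = 1.0 if a["product_group"] == b["product_group"] else 0.0
    tier = 1.0 if a["price_tier"] == b["price_tier"] else 0.0
    weight = min(a["weight_g"], b["weight_g"]) / max(a["weight_g"], b["weight_g"])
    return (group + tier + weight) / 3


def find_analogs(new_item: dict, k: int = 2) -> list[str]:
    """Return the k existing SKUs most similar to the new item."""
    ranked = sorted(catalog, key=lambda sku: similarity(new_item, catalog[sku]), reverse=True)
    return ranked[:k]


def monthly_profile(skus: list[str]) -> dict[int, float]:
    """Average units per month across the analog SKUs."""
    by_month = defaultdict(list)
    for (sku, month), units in sales.items():
        if sku in skus:
            by_month[month].append(units)
    return {month: mean(values) for month, values in sorted(by_month.items())}


def seasonality_index(profile: dict[int, float]) -> dict[int, float]:
    """Each month's demand relative to the average month (1.0 = average)."""
    overall = mean(profile.values())
    return {month: value / overall for month, value in profile.items()}


new_item = {"product_group": "chocolate", "price_tier": "premium", "weight_g": 95}
analogs = find_analogs(new_item)
profile = monthly_profile(analogs)
index = seasonality_index(profile)
print(analogs, profile, index)
```

The monthly profile and seasonality index computed here are the kind of aggregated metrics that would be joined with the LLM-extracted attributes to form the model's feature set.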

Solution Architecture

The full solution can be broken down into functional modules:

Data sources: product cards, sales history, promo history, planned promotions
LLM module: feature extraction
Analog search module
Forecasting module: ML model plus business rules
Interface: API or dashboard
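To illustrate how the forecasting module might combine a model with business rules, here is a minimal sketch. Everything in it is an assumption made for illustration: `baseline_model` is a trivial placeholder standing in for a trained ML model, and the three rules (minimum stock, shelf capacity, whole-case ordering) are hypothetical examples of the constraints a retailer might encode.

```python
import math
from dataclasses import dataclass
from typing import Callable, Mapping

# Any callable mapping a feature dict to a raw demand forecast can act as the model.
Model = Callable[[Mapping[str, float]], float]


@dataclass(frozen=True)
class BusinessRules:
    min_stock: int       # minimum units to keep on hand
    shelf_capacity: int  # maximum units the shelf can hold
    pack_size: int       # units per case; orders are placed in whole cases


def baseline_model(features: Mapping[str, float]) -> float:
    """Placeholder for a trained ML model: analog average scaled by seasonality."""
    return features["analog_avg_units"] * features["seasonality_index"]


def apply_rules(raw_forecast: float, rules: BusinessRules) -> int:
    """Clamp the forecast to [min_stock, shelf_capacity], then round to whole cases."""
    clamped = min(max(raw_forecast, rules.min_stock), rules.shelf_capacity)
    cases = math.ceil(clamped / rules.pack_size)
    # Rounding up must not exceed shelf capacity, so fall back to rounding down.
    if cases * rules.pack_size > rules.shelf_capacity:
        cases = rules.shelf_capacity // rules.pack_size
    return cases * rules.pack_size


def forecast(features: Mapping[str, float], model: Model, rules: BusinessRules) -> int:
    """Forecasting module: model prediction followed by business validation."""
    return apply_rules(model(features), rules)


rules = BusinessRules(min_stock=24, shelf_capacity=150, pack_size=12)
features = {"analog_avg_units": 140.0, "seasonality_index": 0.5}
print(forecast(features, baseline_model, rules))
```

With these numbers the raw forecast of 70 units is rounded up to 72, the nearest whole number of cases. Keeping the rules separate from the model means either can be changed without touching the other, which matches the modular split described above.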

Conclusion

Automating the introduction of new products is a major step toward more flexible and accurate assortment management. Machine learning enables retailers to react faster to market changes and reduce operational costs. As assortments grow and turnover accelerates, manual demand estimation becomes a bottleneck that limits scalability.

A fully automated system also unlocks additional benefits:

Scalability: the ability to process thousands of new products without expanding the analytics team
Process integration: automatic forecast delivery to ERP/CRM systems, procurement planning, and logistics
Adaptability: rapid re‑estimation of demand when descriptions, prices, or supply conditions change
Feedback loops: continuous improvement of model accuracy as actual sales are compared with forecasts and the accumulated errors are fed back into the model

Together, LLMs and machine learning transform product descriptions into structured features, automatically identify relevant analogs, and build demand forecasts that account for seasonality, pricing, and logistical constraints. This reduces reliance on expert judgment, minimizes human error, and accelerates the launch of new products.