We consider an online retailer facing heterogeneous customers with initially unknown product preferences. Customers are characterized by a diverse set of demographic and transactional attributes. The retailer can personalize the customers’ assortment offerings based on available profile information to maximize cumulative revenue. To that end, the retailer must estimate customer preferences by observing transaction data. This, however, may require a considerable amount of data and time given the broad range of customer profiles and the large number of products available. At the same time, the retailer can aggregate (pool) purchasing information among customers with similar product preferences to expedite the learning process. We propose a dynamic clustering policy that estimates customer preferences by adaptively adjusting customer segments (clusters of customers with similar preferences) as more transaction information becomes available. We test the proposed approach with a case study based on a dataset from a large Chilean retailer. The case study suggests that the benefits of the dynamic clustering policy under the multinomial logit (MNL) model can be substantial, resulting (on average) in more than 37% additional transactions compared to a data-intensive policy that treats customers independently, and in more than 27% additional transactions compared to a linear-utility policy that assumes that product mean utilities are linear functions of available customer attributes. We support the insights derived from the numerical experiments by analytically characterizing, in a simplified version of the problem, settings in which pooling transaction information is beneficial for the retailer. We also show that there are diminishing marginal returns to pooling information from an increasing number of customers.