We consider a seller's dynamic pricing problem with demand learning and reference effects. We first study the case where customers are loss averse: they have a reference price that can vary over time, and the demand reduction when the selling price exceeds the reference price is larger than the demand increase when the selling price falls below the reference price by the same amount. Thus, the expected demand as a function of price has a time-varying "kink" at the reference price and is not differentiable there. The seller neither knows the underlying demand function nor observes the time-varying reference prices. In this setting, we design and analyze a pricing policy that (i) changes the selling price very slowly to control the evolution of the reference prices, and (ii) gradually accumulates sales data to balance the tradeoff between learning and earning. We prove that, under a variety of reference-price updating mechanisms, our policy is asymptotically optimal; i.e., its T-period revenue loss (regret) relative to a clairvoyant who knows the demand function and the reference-price updating mechanism grows at the smallest possible rate in T. Conceptually, we show that the effect of unobservable parameters can sometimes be mitigated by a suitably designed pricing policy. We also extend our analysis to the case where customers are gain seeking and design asymptotically optimal policies for that case. Here, a surprising outcome is that the optimal growth rate of the regret is parameter-dependent in a nontrivial way.
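To make the kink and the regret notion concrete, the following is an illustrative sketch only. It assumes a piecewise-linear demand form and an exponential-smoothing reference update, which are common in the reference-effects literature but are not stated above; the symbols $\alpha$, $\beta$, $\eta^{+}$, $\eta^{-}$, $\theta$, $p_t$, $r_t$, and $\mathcal{R}^{\pi}(T)$ are all assumed notation rather than the paper's own.

```latex
% Assumed illustrative model, not necessarily the paper's formulation.
% Expected demand at selling price p and reference price r, with (x)^+ = max(x, 0):
\[
  d(p, r) \;=\; \alpha - \beta p + \eta^{+}\,(r - p)^{+} - \eta^{-}\,(p - r)^{+},
  \qquad \alpha, \beta > 0, \quad \eta^{+}, \eta^{-} \ge 0 .
\]
% Loss averse:  \eta^{-} > \eta^{+}  (losses weigh more than equal-sized gains).
% Gain seeking: \eta^{+} > \eta^{-}.
% The slope in p jumps from -(\beta + \eta^{+}) to -(\beta + \eta^{-}) at p = r,
% so d(\cdot, r) is not differentiable at the reference price whenever
% \eta^{+} \ne \eta^{-}.

% One possible reference-price updating mechanism (exponential smoothing):
\[
  r_{t+1} \;=\; \theta\, r_t + (1 - \theta)\, p_t, \qquad \theta \in [0, 1).
\]

% T-period regret of a policy \pi relative to the clairvoyant, whose prices
% and induced reference prices are denoted p_t^{*} and r_t^{*}:
\[
  \mathcal{R}^{\pi}(T) \;=\;
  \mathbb{E}\!\left[\, \sum_{t=1}^{T}
    \Bigl( p_t^{*}\, d(p_t^{*}, r_t^{*}) - p_t\, d(p_t, r_t) \Bigr) \right].
\]
```

Under this assumed update, a constant price pulls the reference price toward it geometrically, since $r_{t+1} - p = \theta\,(r_t - p)$, which is one way to read why slowly changing prices help control the evolution of the reference prices.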