Next-basket recommendation (NBR) is a recommendation task that predicts the basket, or set of items, a user is likely to adopt next, based on the user's history of basket adoption sequences. It enables a wide range of novel applications and services, from predicting the next basket of items for grocery shopping to recommending food items a user is likely to consume together in the next meal. Even though much progress has been made in algorithmic NBR research over the years, little research has been done to broaden knowledge about the evaluation of NBR methods, which is largely based on offline evaluation experiments and the binary relevance paradigm. Specifically, we argue that recommended baskets that are more similar to the ground-truth baskets are better recommendations than those that bear little resemblance to the ground truth, and that they should therefore be granted partial credit. Based on this notion of non-binary relevance assessment, we propose new evaluation metrics for NBR by adapting and extending similarity metrics from natural language processing (NLP) and text classification research. To validate the proposed metrics, we conducted two user studies on next-meal food recommendation using numerous state-of-the-art NBR methods in both online and offline evaluation settings. Our findings show that the offline performance assessment based on the proposed non-binary evaluation metrics is more representative of the online evaluation performance than the assessment based on the standard evaluation metrics.
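To make the notion of partial credit concrete, the following minimal sketch contrasts a binary, item-level recall with a soft recall in which each ground-truth item is credited with its best similarity to any recommended item. It is an illustration only and not the metrics proposed in this work: the item names, the `CATEGORY` table, the `toy_similarity` function and its 0.5 credit for same-category items are all hypothetical assumptions chosen for the example.

```python
# Illustrative sketch only: the items, categories and similarity values below
# are hypothetical and are not taken from the paper.
CATEGORY = {
    "latte": "coffee",
    "cappuccino": "coffee",
    "croissant": "bakery",
    "bagel": "bakery",
    "steak": "meat",
    "soda": "soft_drink",
}


def toy_similarity(a, b):
    """Hypothetical item similarity: 1.0 if identical, 0.5 if same category, else 0.0."""
    if a == b:
        return 1.0
    cat_a, cat_b = CATEGORY.get(a), CATEGORY.get(b)
    if cat_a is not None and cat_a == cat_b:
        return 0.5
    return 0.0


def binary_recall(recommended, ground_truth):
    """Binary relevance: an item counts only if it appears in the ground-truth basket."""
    truth = set(ground_truth)
    if not truth:
        return 0.0
    return len(truth & set(recommended)) / len(truth)


def soft_recall(recommended, ground_truth, sim=toy_similarity):
    """Non-binary relevance: each ground-truth item earns its best similarity
    to any recommended item, averaged over the ground-truth basket."""
    truth, rec = set(ground_truth), set(recommended)
    if not truth or not rec:
        return 0.0
    return sum(max(sim(t, r) for r in rec) for t in truth) / len(truth)


ground_truth = ["latte", "croissant"]
basket_a = ["cappuccino", "bagel"]  # no exact match, but similar items
basket_b = ["steak", "soda"]        # no match and no resemblance

print(binary_recall(basket_a, ground_truth), binary_recall(basket_b, ground_truth))  # 0.0 0.0
print(soft_recall(basket_a, ground_truth), soft_recall(basket_b, ground_truth))      # 0.5 0.0
```

Under binary relevance the two recommended baskets are indistinguishable, whereas the soft variant ranks the basket that resembles the ground truth higher, which is the behavior the argument above calls for.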