There was already news on Habr about this significant event. Indeed, it resembles a retelling of the official Microsoft press release, but that's how the 'news' should be.
My Early Attempts
For some reason that even I don't understand, I started trying to use Python inside Excel a few weeks ago. Unsuccessfully. Despite all my efforts, it didn't work: initially, I connected to Microsoft 365 Insider, Beta Channel and didn't observe the option there. Then I downloaded an add-in from Excel Labs, the Microsoft Garage project. The feature, although it started, did not work for me, constantly returning errors.
The Big Update
And then, on 12.10.2023, I received an email with good news: after updating Office, there is now a "Python in Excel" option in the Formulas menu. Entering "=PY(" in the command line converts the current cell into something resembling a Jupyter Notebook cell.
Important Reminder: To access and use the "Python in Excel" feature, you need to be part of the "Microsoft 365 Insider, Beta Channel."
How Does It Work?
The code is executed on Microsoft servers. All you need to do on your local computer is start Excel and enter the familiar magical combination "=PY(". Then you can enter the code. Of course, copy-paste works. The syntax is Python, with a slight specificity if you need to access data. The 'Ctrl+Enter' command ignites the code.
The Big Question: Why?
Why do we need Python inside Excel at all? This is the main question, a huge “Why?” Are there any advantages over using the two products independently and exchanging data between them when necessary? Possibly there are, but with a big question mark. And despite all my heavy pondering, I can't find an answer, although I'm trying very hard.
Potential Advantages
- You can avoid installing Python altogether.
- For simple tasks that are more efficiently solved by code than by built-in Excel tools, you don't need to multiply entities: both code and data can live inside one file.
- Another reason to utilize this feature is as a substitute for Microsoft Excel macros, especially if you're well-acquainted with Python but not with macros. Thanks to @alexandershelupinin who pointed this out.
- And perhaps something else that I'm not aware of or haven't thought of.
My Personal Take
Thinking soberly, I can't come up with anything rational and sensible, which would require forgetting Python in Excel as an unnecessary toy. But! Procrastination and/or curiosity force me to figure out the feature and come up with scenarios for using it. A man is weak (I definitely am), can't resist temptations. The result of my weakness is this material.
First Impressions
According to my disgusting habit, instead of starting the process by carefully reading relevant documentation, I started exploring the feature by using the simplest and most effective of all methods, namely the "trial and error method". Which, as often happens, led to several hours of bouncing from one elementary mistake to another and a waste of time and nervous energy. Common sense prevailed and - wonder of wonders! - after reading the documentation (see links below), the mystery was solved.
As it turned out, the fact that the code runs on Microsoft's servers has side effects: time delay, can't install third party libraries, accessing data outside the current workbook only via Power Query, can't save data outside the current workbook, etc.
Plus, Python syntax is often with adjustments for "living inside Excel". For example, to assign the value from cell B2 to a variable, we write x = xl("B2"). And of course, writing code inside Excel's formula bar is a pleasure for perverts. Although there is nothing stopping you from writing in a text editor and then copy-pasting. With all the big and small "BUTs".
Let me try it on a real-life use case.
Epic Fail
Since mid-January this year, 2023, I've been having a peculiar romance with Garmin, specifically its sports watches, mobile app, and personal account. This wasn't your everyday romance; it was a dance of numbers and data. Every week, like clockwork, I would download and process various sets of data—workouts, breathing exercises, and more. Then, I would merge these with records of my resting heart rate and HRV. My trusty Python script managed all this seamlessly. But one day, in a mix of curiosity and perhaps a dash of procrastination, I wondered: Could Excel's new Python feature handle this?
With optimism, I dove in. I imagined my CSV file, brimming with raw data from the Activities section of my Garmin account, being effortlessly tamed by this new Excel Python integration. I dreamt of easy summarizations, data pulls, merges, and updates. But reality had other plans. After several grueling hours, I had to admit defeat and put this challenge aside.
And Now It Worked
Working with CSV: reading and processing data
As part of my “romantic” routine I need to supplement my Garmin data with heart rate readings from the Elite HRV app. The non-PRO version of Elite HRV doesn't allow for direct data exports, but with a simple request, they send you a link to download an archive of TXT files. These files contain RR intervals—each representing the milliseconds between heartbeats. A cardiologist may task you with extracting heart rate readings from these intervals and plotting a graph to show heart rate fluctuations over time.
With renewed determination, I tackled the challenge using Python in Excel. There were a few hiccups, especially with the query creation, but in the end, it took me just about fifteen minutes to draft a rather intricate code.
# read data
rr_intervals = xl("EliteHRV_SS")
rr_intervals.columns = ['rr_intervals']
rr_intervals['rr_intervals'] = pd.to_numeric(rr_intervals['rr_intervals'])
# calculating heart rate for each interval
rr_intervals['heart_rate'] = (60000 / rr_intervals['rr_intervals']).astype(int)
# Calculate cumulative time in seconds for each RR interval
cumulative_time = np.cumsum(rr_intervals['rr_intervals'].values) / 1000 # Convert from ms to s
cumulative_time = [int(t) for t in cumulative_time]
heart_rates = rr_intervals['heart_rate'].values
df = pd.DataFrame({
'timepoints': cumulative_time,
'heart_rate': heart_rates
})
# Plotting
plt.figure(figsize=(15, 6))
plt.plot(cumulative_time, heart_rates, label='Heart Rate', color='blue')
plt.xlabel('Time (seconds)')
plt.ylabel('Heart Rate (bpm)')
plt.title('Heart Rate Over Time')
plt.legend()
plt.grid(True)
plt.show()
The result? A processed dataset and a beautifully rendered graph showcasing the rhythm of my heart.
Some Feature Specifics
In addition to specifics in working with data, especially external, and the need for workarounds to save data outside the current workbook, I found some other features.
There is no need to import libraries, if those (libraries) are on dedicated for the job Microsoft servers. Note - in the `pd.to_numeric(rr_intervals['rr_intervals'])` line, Pandas' `to_numeric` is called, but Pandas is not imported. The same is true for calling `plt`.
Autocomplete works, but not as nicely as we're used to.
I'm sure the list of specifics is much longer.
Will I Use the Feature?
Reflecting on this journey, does the Python in Excel feature captivate me? The answer is more "yes" than "no". For simple tasks, like basic processing of a small dataset, it's a perfectly workable tool. But as soon as the task gets a little more complicated, regular Python (Anaconda in my case) is much more convenient and efficient.
What about you? Will you try it?