Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question - How to read color object in a cell but not color itself #112

Closed
SSMK-wq opened this issue Feb 22, 2022 · 5 comments
Closed

Question - How to read color object in a cell but not color itself #112

SSMK-wq opened this issue Feb 22, 2022 · 5 comments
Labels

Comments

@SSMK-wq
Copy link

SSMK-wq commented Feb 22, 2022

Hi, I am new to StyleFrame and currently using python 3.7, pandas 1.3.4, and openpyxl 3.0.9

I am currently encountering a scenario where my excel cell has colored object (circles) but the cell itself is not colored. I have attached a sample file with data

DUMMY_DATA_TEST.xlsx

image

each color has a score. ex: red color = 0, green color = 90 and Tacao = 50

So, for each row, I would like to compute the average score and store it in a new column.

For instance,
1st row contains 4 green, 4 red, 2 tacao
2nd row contains 2 green, 3 red, 3 tacao
based on the score for each color,
1st row average = 46
2nd row average = 41.25

As a first step, I tried the below to read the excel using styleframe

from styleframe import StyleFrame, utils
# from StyleFrame import StyleFrame, utils (if using version < 3.X)

sf = StyleFrame.read_excel('DUMMY_DATA_TEST.xlsx', read_style=True)
print(sf)
sf = sf[[col for col in sf.columns if col.style.fill.fgColor.rgb in ('#FF0000', utils.colors.red)]]

but the issue is it doesn't get read properly.

Is there any suggestion that you have on whether it can be achieved using styleframe?

If you are interested to know what I am trying to achieve, it is given below

image

@DeepSpace2
Copy link
Owner

DeepSpace2 commented Feb 22, 2022

The first step, even before trying to deal with the colors, is to make sure that we can read the sheet correctly without losing the complex format, which I suspect is not the case.

styleframe uses pandas.read_excel to read files. If we put it aside for a sec and try to see how pandas handles this file we will see that the complex formatting (ie all the merged cells) is lost:

import pandas as pd

pd.read_excel('DUMMY_DATA_TEST.xlsx').to_excel('output.xlsx')

When we open output.xlsx we will see the formatting is lost, so there is not much that styleframe can do about that.

I previously had a go at adding support for reading sheets with merged cells (see #76). Can you confirm if that workaround works in your case?

@SSMK-wq
Copy link
Author

SSMK-wq commented Feb 22, 2022

@DeepSpace2 - I am unable to install using the below command - pip install -U git+https://github.com/DeepSpace2/StyleFrame@preserving-merged-cells. Is there any other way to test this?

@DeepSpace2
Copy link
Owner

Are you getting an error?

@SSMK-wq
Copy link
Author

SSMK-wq commented Feb 23, 2022

@DeepSpace2 - Yes, I do get an error. Sorry, am not a software developer. hence i might be doing something incorrect as well

@stale
Copy link

stale bot commented Apr 24, 2022

This issue has been automatically marked as stale because it has not had activity in the last 60 days.

@stale stale bot added the stale label Apr 24, 2022
@stale stale bot closed this as completed May 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants